Abstract

We introduce an approach to detecting inconsistencies in large biological networks by using Answer
Set Programming (ASP). To this end, we build upon a recently proposed notion of consistency
between biochemical/genetic reactions and high-throughput profiles of cell activity. We then present an
approach based on ASP to check the consistency of large-scale data sets. Moreover, we extend this
methodology to provide explanations for inconsistencies by determining minimal representations of
conflicts. In practice, this can be used to identify unreliable data or to indicate missing reactions.

KEYWORDS: answer set programming, bio-informatics, consistency, diagnosis
1 Introduction

Molecular biology has seen a technological revolution with the establishment of high-throughput
methods in recent years. These methods allow for gathering multiple orders of magnitude more
measured data than was procurable before. Furthermore, there is an increasing number of biological
repositories on the web, such as KEGG, Biomodels, Reactome, MetaCyc, and others, incorporating
thousands of biochemical reactions and genetic regulations. However, both measurements and
biological networks are prone to considerable incompleteness, heterogeneity, and mutual
inconsistency, which makes it highly non-trivial to draw biologically meaningful conclusions in an
automated way. As a consequence, appropriate representation and powerful reasoning tools are
needed to model complex biological systems in the face of incompleteness and inconsistency.

In this paper, we deal with the analysis of high-throughput measurements in molecular
biology, like microarray data or metabolic profiles. Up to now, it is still common practice to
use expression profiles merely for detecting over- or under-expressed genes under specific
conditions, leaving the task of making biological sense out of a multitude of gene identifiers
to human experts. However, many efforts have also been made to better utilize high-throughput
data, in particular, by integrating them into large-scale models of transcriptional
regulations or metabolic processes (Friedman et al. 2000; Klamt and Stelling 2006).
One possible approach consists of investigating the compatibility between experimental
measurements and knowledge available in reaction databases. This can be done by using
formal frameworks, for instance, the ones developed in (Gutierrez-Rios et al. 2003) and
(Siegel et al. 2006). A crucial feature of this methodology is its ability to cope with
qualitative knowledge (for instance, reactions lacking kinetic details) and noisy data. In what
follows, we rely upon the so-called Sign Consistency Model (SCM) due to (Siegel et al. 2006).
SCM imposes constraints between experimental measurements and a graph representation
of cellular interactions, called an influence graph (Soule 2003). Such a graph provides an
over-approximation of the actual biological model, where an "influence" is modeled by a
disjunctive causal rule. This is particularly well-suited for dealing with incomplete (missing
reactions) or unreliable (noisy data) information.

Building on the SCM framework, we develop declarative techniques based on Answer
Set Programming (ASP) (Baral 2003; Gelfond 2008) to detect and explain inconsistencies
in large data sets. This approach has several advantages. First, it allows us to formulate
biological problems in a declarative way, thus easing the communication with biological
experts. Second, although we do not detail it here, the rich modeling language facilitates
integrating different knowledge representation and reasoning techniques, like abduction,
explanation, planning, prediction, etc., in a uniform and transparent way (cf. Gebser et al. 2010
for such extensions). And finally, modern ASP solvers are based on advanced Boolean
constraint solving technology and thus provide us with highly efficient inference engines.
Apart from modeling the aforementioned biological problems in ASP, our major concern
lies with the scalability of the approach. To this end, we apply our methods to the
gene-regulatory network of yeast (Guelzim et al. 2002; Sudarsanam et al. 2000) and, moreover,
design an artificial yet biologically meaningful benchmark suite indicating that an ASP-based
approach scales well on the considered class of applications. Notably, to the best of
our knowledge, the functionalities we provide go beyond the ones of the only comparable
approach (Guziolowski et al. 2009).

To begin with, we introduce SCM in Section 2. Section 3 gives the syntax and semantics
of ASP used in our application. In Section 4, we develop an ASP formulation of checking
the consistency between experimental profiles and influence graphs. We further extend
this approach in Section 5 to identifying minimal representations of conflicts if the
experimental data is inconsistent with an influence graph. In Section 6, we describe simple yet
effective techniques for input reduction along with a connectivity property that is used to
refine the encoding presented in Section 5. Section 7 is dedicated to an empirical evaluation
of our approach along with an exemplary case study on yeast. For making our methods easily
accessible, an available web service is presented in Section 8. Section 9 concludes the
paper with a discussion and outlook on future work. Finally, Appendix A and Appendix B
contain proofs of soundness and completeness for our problem formulations in ASP.

2 Influence Graphs and Sign Consistency Constraints 

Influence graphs (Soule 2003) are a common representation for a wide range of dynamical
systems. In the field of genetic networks, they have been investigated for various classes of
systems, ranging from ordinary differential equations (Soule 2006) to synchronous (Remy et al. 2008)
and asynchronous (Richard et al. 2004) Boolean networks. Influence graphs have also been
introduced in the field of qualitative reasoning (Kuipers 1994) to describe physical systems
where a detailed quantitative description is unavailable. In fact, this has been the main
motivation for using influence graphs for knowledge representation in the context of biological
systems.

Figure 1. Simplified model of the lactose operon in E. coli, represented as an influence graph.
The vertices represent either genes, metabolites, or proteins, while the edges indicate the
regulations among them. Edges with an arrow stand for positive regulations (activations),
while edges with a tee head stand for negative regulations (inhibitions). Vertices G and Le
are considered to be inputs of the system, that is, their signs are not constrained via their
incoming edges.

An influence graph is a directed graph whose vertices are the input and state variables
of a system and whose edges express the effects of variables on each other.

Definition 2.1 (Influence Graph)

An influence graph is a directed graph $(V, E, \sigma)$, where $V$ is a set of vertices, $E$ a set of
edges, and $\sigma : E \to \{+, -\}$ a (partial) labeling of the edges.

An edge $j \to i$ means that the variation of $j$ in time influences the level of $i$. Every edge
$j \to i$ of an influence graph can be labeled with a sign, either + or -, denoted by $\sigma(j, i)$,
where + (-) indicates that $j$ tends to increase (decrease) $i$. An example influence graph is
given in Figure 1; it represents a simplified model of the lactose operon in E. coli.

In SCM, experimental profiles are supposed to come from steady state shift experiments
where, initially, the system is at steady state, then perturbed using control parameters, and
eventually, it settles into another steady state. It is assumed that the data measures the
differences between the initial and the final state. Thus, for genes, proteins, or metabolites,
we know whether the concentration has increased or decreased, while quantitative values
are unavailable, unessential, or unreliable. By $\mu(i)$, we denote the sign, again either + or -,
of the variation of a species $i$ between the initial and the final condition. One can easily
extend this setting to also consider null (or, more precisely, non-significant) variations
by exploiting the concept of sign algebra (Kuipers 1994).

Table 1. Some vertex labelings (reflecting measurements of two steady states) for the
influence graph depicted in Figure 1; unobserved values are indicated by a question mark '?'.

Given an influence graph (as a representation of cellular interactions) and a labeling
of its vertices with signs (as a representation of experimental profiles), we now describe
the constraints that relate both. Informally, for every non-input vertex $i$, its variation $\mu(i)$
ought to be explained by the influence of at least one predecessor $j$ of $i$ in the influence
graph. Thereby, the influence of $j$ on $i$ is given by the sign $\mu(j)\sigma(j, i) \in \{+, -\}$, where the
multiplication of signs is derived from that of numbers. Sign consistency constraints can
then be formalized as follows.

Definition 2.2 (Sign Consistency Constraints)

Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+, -\}$ a (partial) vertex labeling.

Then, $(V, E, \sigma)$ and $\mu$ are consistent, if there are some total extensions $\sigma' : E \to \{+, -\}$
of $\sigma$ and $\mu' : V \to \{+, -\}$ of $\mu$ such that $\mu'(i)$ is consistent for each non-input vertex $i \in V$,
where $\mu'(i)$ is consistent, if there is some edge $j \to i$ in $E$ such that $\mu'(i) = \mu'(j)\sigma'(j, i)$.

Note that labelings $\mu$ and $\sigma$ of vertices and edges, respectively, are admitted to be partial.
This occurs frequently in practice, where the kind of an influence may depend on environmental
factors or experimental data may not include all elements of a biological system.
In order to decide whether a partially labeled influence graph and a partial experimental
profile are mutually consistent, we thus consider the possible totalizations of them. If at
least one total edge and one total vertex labeling (extending the given labelings) are such
that the signs of all non-input vertices are explained, this is sufficient for mutual consistency.

Table 1 gives four vertex labelings for the influence graph in Figure 1. Total labeling
$\mu_1$ is consistent with the influence graph: the variation of each vertex (except for input
vertex Le) can be explained by the effect of one of its regulators. For instance, in $\mu_1$, LacY
receives a positive influence from cAMP-CRP as well as a negative influence from LacI,
the latter accounting for the decrease of LacY. The second labeling, $\mu_2$, is not consistent:
LacY receives only negative influences from cAMP-CRP and LacI, and its increase cannot
be explained. Partial vertex labeling $\mu_3$ is consistent with the influence graph in Figure 1,
as setting the signs of Li, LacY, LacZ, A, and cAMP-CRP to +, -, -, -, and +, respectively,
extends $\mu_3$ to a consistent total labeling. In contrast, $\mu_4$ cannot be extended consistently.
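
The sign consistency constraints can be illustrated by a small brute-force check. The following
Python sketch is not part of the ASP-based approach of this paper; it simply enumerates all total
extensions of partial labelings and tests the constraints of Definition 2.2. The function name
is_consistent, the encoding of signs as +1 and -1, and the three-vertex fragment (in which
cAMP-CRP and LacI are treated as inputs, with sign values chosen in the spirit of the first two
labelings discussed above) are assumptions made for illustration only.

```python
from itertools import product

SIGNS = (+1, -1)  # +1 encodes "+", -1 encodes "-"


def is_consistent(vertices, edges, inputs, mu, sigma):
    """Brute-force test of Definition 2.2 (exponential; for tiny graphs only).

    edges: list of pairs (j, i) meaning j -> i
    mu:    partial vertex labeling, dict vertex -> sign
    sigma: partial edge labeling, dict (j, i) -> sign
    """
    free_v = [v for v in vertices if v not in mu]
    free_e = [e for e in edges if e not in sigma]
    for v_signs in product(SIGNS, repeat=len(free_v)):
        mu_t = {**mu, **dict(zip(free_v, v_signs))}  # total extension of mu
        for e_signs in product(SIGNS, repeat=len(free_e)):
            sigma_t = {**sigma, **dict(zip(free_e, e_signs))}  # total extension of sigma
            # Each non-input vertex i needs a regulator j with mu'(i) = mu'(j) * sigma'(j, i).
            if all(
                any(mu_t[i] == mu_t[j] * sigma_t[(j, k)] for (j, k) in edges if k == i)
                for i in vertices
                if i not in inputs
            ):
                return True
    return False


# Hypothetical fragment: two regulators of LacY, both treated as inputs here.
V = ["cAMP-CRP", "LacI", "LacY"]
E = [("cAMP-CRP", "LacY"), ("LacI", "LacY")]
SIGMA = {("cAMP-CRP", "LacY"): +1, ("LacI", "LacY"): -1}
INPUTS = {"cAMP-CRP", "LacI"}

print(is_consistent(V, E, INPUTS, {"cAMP-CRP": +1, "LacI": +1, "LacY": -1}, SIGMA))
print(is_consistent(V, E, INPUTS, {"cAMP-CRP": -1, "LacI": +1, "LacY": +1}, SIGMA))
print(is_consistent(V, E, INPUTS, {"LacI": +1}, SIGMA))
```

In this fragment, a decrease of LacY is explained by the negative influence of LacI, an increase
of LacY under only negative influences is rejected, and a partial labeling that fixes LacI alone
can be extended consistently.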



3 Answer Set Programming 

This section provides a brief introduction to ASP, a declarative problem solving paradigm
offering a rich modeling language (Lparse Manual; Gebser et al. 2009a) along with highly
efficient inference engines based on Boolean constraint solving technology (Giunchiglia et al. 2006;
Gebser et al. 2009c; Drescher et al. 2008). The basic idea of ASP is to encode a problem
as a logic program such that its answer sets represent solutions.

In view of our application, we take advantage of the elevated expressiveness of disjunctive
programs, capturing problems at the second level of the polynomial hierarchy
(Eiter and Gottlob 1995). A disjunctive logic program $P$ is a finite set of rules of the form

    $a_1; \dots; a_l \leftarrow a_{l+1}, \dots, a_m, \mathit{not}\ a_{m+1}, \dots, \mathit{not}\ a_n,$    (1)

where $a_i$ is an atom for $1 \leq i \leq n$. A rule $r$ as in (1) is called a fact if $l = m =
n = 1$, and an integrity constraint if $l = 0$. Let $\mathit{head}(r) = \{a_1, \dots, a_l\}$ be the head
of $r$, $\mathit{body}(r) = \{a_{l+1}, \dots, a_m, \mathit{not}\ a_{m+1}, \dots, \mathit{not}\ a_n\}$ be the body of $r$, and let
$\mathit{body}(r)^+ = \{a_{l+1}, \dots, a_m\}$ and $\mathit{body}(r)^- = \{a_{m+1}, \dots, a_n\}$.

An interpretation is represented by the set of atoms that are true in it. A model of a
program $P$ is an interpretation in which all rules of $P$ are true according to the standard
definition of truth in propositional logic. Apart from letting ';' and ',' stand for disjunction
and conjunction, respectively, this implies treating rules and default negation 'not'
as implications and classical negation, respectively. Note that the (empty) head of an
integrity constraint is false in every interpretation, while the empty body is true in every
interpretation. Answer sets of $P$ are particular models of $P$ satisfying an additional stability
criterion. Roughly, a set $X$ of atoms is an answer set, if for every rule of form (1), $X$
contains a minimum of atoms among $a_1, \dots, a_l$ whenever $a_{l+1}, \dots, a_m$ belong to $X$ and
no $a_{m+1}, \dots, a_n$ belongs to $X$. However, the disjunction in heads of rules, in general, is
not exclusive. Formally, an answer set $X$ of a program $P$ is a $\subseteq$-minimal model of

    $\{\mathit{head}(r) \leftarrow \mathit{body}(r)^+ \mid r \in P, \mathit{body}(r)^- \cap X = \emptyset\}.$

For example, program {a; b ←. c; d ← a, not b. ← b.} has answer sets {a, c} and {a, d}.
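
To make the formal definition concrete, the following Python sketch enumerates the answer sets
of a small ground disjunctive program by brute force: for each candidate set X, it builds the
reduct and tests whether X is a subset-minimal model of it. This is an illustration only, not
how ASP solvers work internally; the representation of a rule as a triple (head, positive body,
negative body) of sets and the function names are assumptions of this sketch.

```python
from itertools import chain, combinations


def subsets(atoms):
    """All subsets of a collection of atoms, smallest first."""
    atoms = sorted(atoms)
    return (set(c) for c in chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1)))


def is_model(positive_rules, interp):
    """A rule (head, pos) holds if its body is not contained in interp or its head meets interp."""
    return all((not pos <= interp) or bool(head & interp) for head, pos in positive_rules)


def answer_sets(program):
    """program: list of (head, pos, neg) triples of sets of atoms (ground rules)."""
    atoms = set().union(*(h | p | n for h, p, n in program))
    result = []
    for x in subsets(atoms):
        # Reduct: drop rules whose negative body meets X, then drop the negative bodies.
        reduct = [(h, p) for h, p, n in program if not (n & x)]
        if is_model(reduct, x) and not any(
            is_model(reduct, y) for y in subsets(x) if y < x
        ):
            result.append(x)
    return result


# The example program: a; b <- .   c; d <- a, not b.   <- b.
P = [
    ({"a", "b"}, set(), set()),
    ({"c", "d"}, {"a"}, {"b"}),
    (set(), {"b"}, set()),
]
print(answer_sets(P))
```

For the example program, the enumeration yields exactly the two answer sets {a, c} and {a, d}
stated above.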

Although answer sets are usually defined on ground (i.e., variable-free) programs, ASP
allows for non-ground problem encodings, where schematic rules stand for their ground
instantiations. Grounders, such as gringo (Gebser et al. 2009a) and lparse (Lparse Manual),
are capable of combining a problem encoding and an instance (typically a set of ground
facts) into an equivalent ground program, which is then processed by an ASP solver. We
follow this methodology and provide encodings for the problems considered below.

4 Checking Consistency 

We now come to the first main question addressed in this paper, namely, how to check
whether an experimental profile is consistent with a given influence graph. Note that, if
the profile provides us with a sign for each vertex of the influence graph, the task can be
accomplished simply by checking whether each non-input vertex receives at least one
influence matching its variation. However, as soon as the experimental profile has missing
values (which is very likely in practice), the problem becomes NP-hard (Veber et al. 2004).
In fact, a Boolean satisfiability problem over clauses $C_1, \dots, C_m$ and variables $x_1, \dots, x_n$
can be reduced as follows: introduce unlabeled input vertices $x_1, \dots, x_n$, non-input
vertices $C_1, \dots, C_m$ labeled +, and edges $x_j \to C_i$ labeled + (-) if $x_j$ occurs positively
(negatively) in $C_i$. It is not hard to check that the labeling of $C_1, \dots, C_m$ by + is consistent with
the obtained influence graph iff the conjunction of $C_1, \dots, C_m$ is satisfiable.
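
The reduction can be sketched in a few lines of Python. The sketch below is an illustration
under stated assumptions: clauses are given as lists of non-zero integers (a DIMACS-like
convention chosen here), no clause contains a variable together with its negation, and the
names sat_to_influence_graph and labeling_consistent are hypothetical. Consistency is tested
naively by enumerating the signs of the input vertices.

```python
from itertools import product


def sat_to_influence_graph(clauses):
    """Build the influence graph of the reduction.

    clauses: list of clauses, each a list of non-zero ints (k means x_k, -k means not x_k).
    Returns (inputs, edges, mu): input vertices, signed edges, and the clause labeling.
    """
    inputs = {f"x{abs(lit)}" for clause in clauses for lit in clause}
    edges = {}  # (x_j, C_i) -> sign
    mu = {}     # every clause vertex is labeled +
    for i, clause in enumerate(clauses, start=1):
        mu[f"C{i}"] = +1
        for lit in clause:
            edges[(f"x{abs(lit)}", f"C{i}")] = +1 if lit > 0 else -1
    return inputs, edges, mu


def labeling_consistent(inputs, edges, mu):
    """Try every sign assignment to the (unlabeled) input vertices."""
    xs = sorted(inputs)
    for signs in product((+1, -1), repeat=len(xs)):
        total = {**mu, **dict(zip(xs, signs))}
        # Each clause vertex must receive at least one influence matching its label +.
        if all(
            any(total[j] * s == total[c] for (j, k), s in edges.items() if k == c)
            for c in mu
        ):
            return True
    return False


print(labeling_consistent(*sat_to_influence_graph([[1, 2], [-1]])))
print(labeling_consistent(*sat_to_influence_graph([[1, 2], [-1], [-2]])))
```

Assigning + to an input vertex corresponds to setting the variable to true; a clause vertex is
explained exactly when one of its literals is satisfied.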

We next provide a logic program such that each of its answer sets matches a consistent
extension of vertex and edge labelings. Our encodings as well as instances are available
at (BioASP Tools). The program for consistency checking is composed of three parts,
described in the following subsections.



4.1 Problem Instance 

An influence graph as well as an experimental profile are given by ground facts. For each
species $i$, we introduce a fact vertex(i), and for each edge $j \to i$, a fact edge(j, i). If
$s \in \{+, -\}$ is known to be the variation of a species $i$ or the sign of an edge $j \to i$, it is expressed
by a fact observedV(i, s) or observedE(j, i, s), respectively. Finally, a vertex $i$ is declared
to be input via a fact input(i).

For example, the negative regulation LacI → LacY in the influence graph shown in
Figure 1 and observation + for LacI (as with $\mu_3$ in Table 1) give rise to the following facts:

    vertex(LacI).
    vertex(LacY).
    edge(LacI, LacY).                                       (2)
    observedV(LacI, +).
    observedE(LacI, LacY, -).

Note that the absence of a fact of form observedV(LacY, s) means that the variation of
LacY is unobserved (as with $\mu_3$). In (2), we use LacI and LacY as names for constants
associated with the species in Figure 1, but not as first-order variables. Similarly, for
uniformity of notations, + and - are written in (2) for constants identifying signs.
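
As a sketch of how such an instance could be produced programmatically, the following Python
function turns a graph and a profile into a list of fact strings. The function name to_facts and,
in particular, the choice to write species names and signs as quoted strings are assumptions of
this sketch; the encodings distributed with the paper may use different constant names.

```python
def to_facts(vertices, edges, inputs=(), observed_v=None, observed_e=None):
    """Render an influence graph and a profile as ASP facts (one string per fact).

    vertices:   iterable of species names
    edges:      iterable of pairs (j, i) meaning j -> i
    inputs:     iterable of input vertices
    observed_v: dict vertex -> "+" or "-"
    observed_e: dict (j, i) -> "+" or "-"
    """
    observed_v = observed_v or {}
    observed_e = observed_e or {}

    def q(term):
        # Quote a term so that capitalized names are not read as variables (assumed convention).
        return '"%s"' % term

    facts = [f"vertex({q(v)})." for v in vertices]
    facts += [f"edge({q(j)},{q(i)})." for j, i in edges]
    facts += [f"input({q(v)})." for v in inputs]
    facts += [f"observedV({q(v)},{q(s)})." for v, s in observed_v.items()]
    facts += [f"observedE({q(j)},{q(i)},{q(s)})." for (j, i), s in observed_e.items()]
    return facts


facts = to_facts(
    ["LacI", "LacY"],
    [("LacI", "LacY")],
    observed_v={"LacI": "+"},
    observed_e={("LacI", "LacY"): "-"},
)
print("\n".join(facts))
```

The call at the end reproduces the five facts of (2), up to the quoting of constants.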



4.2 Generating Solution Candidates 

As mentioned above, our goal is to check whether an experimental profile is consistent
with an influence graph. If so, it is witnessed by total labelings of the vertices and edges,
which are generated via the following rules:

    labelV(V, +); labelV(V, -) ← vertex(V).
    labelE(U, V, +); labelE(U, V, -) ← edge(U, V).           (3)

Moreover, the following rules ensure that known labels are respected by total labelings:

    labelV(V, S) ← observedV(V, S).
    labelE(U, V, S) ← observedE(U, V, S).                    (4)

Note that the stability criterion for answer sets demands that a known label derived via
a rule in (4) is also derived via (3), thus, excluding the opposite label. In fact, the
disjunctive rules used in this section could actually be replaced with non-disjunctive rules
via "shifting" (Gelfond et al. 1991),^1 given that our first encoding results in a so-called
head-cycle-free (HCF) (Ben-Eliyahu and Dechter 1994) ground program. However, similar
disjunctive rules are also used in Section 5, where they cannot be compiled away. Also
note that HCF programs, for which deciding answer set existence stays in NP, are recognized
as such by disjunctive ASP solvers (Leone et al. 2006; Drescher et al. 2008). Hence,
the purely syntactic use of disjunction, as done here, is not harmful to efficiency.

^1 Alternatively, one could also use cardinality constraints (cf. Lparse Manual), which would however preclude
a comparison with dlv in Section 7.

The following ground rules are obtained by combining the schematic rules in (3) and (4)
with the facts in (2):

    labelV(LacI, +); labelV(LacI, -) ← vertex(LacI).
    labelV(LacY, +); labelV(LacY, -) ← vertex(LacY).
    labelE(LacI, LacY, +); labelE(LacI, LacY, -) ← edge(LacI, LacY).     (5)
    labelV(LacI, +) ← observedV(LacI, +).
    labelE(LacI, LacY, -) ← observedE(LacI, LacY, -).

One can check that the program consisting of the facts in (2) and the rules in (5) admits
two answer sets, the first one including labelV(LacY, +) and the second one including
labelV(LacY, -). On the remaining atoms, both answer sets coincide by containing the
atoms in (2) along with labelV(LacI, +) and labelE(LacI, LacY, -).

4.3 Testing Solution Candidates 

We now check whether generated total labelings satisfy the sign consistency constraints
stated in Definition 2.2, requiring an influence of sign $s$ for each non-input vertex $i$ with
variation $s$. We thus define receive(i, s) to indicate that $i$ receives an influence of sign $s$:

    receive(V, +) ← labelE(U, V, S), labelV(U, S).
    receive(V, -) ← labelE(U, V, S), labelV(U, T), S ≠ T.    (6)

Inconsistent labelings, where a non-input vertex does not receive any influence matching
its variation, are then ruled out by integrity constraints of the following form:

    ← labelV(V, S), not receive(V, S), not input(V).         (7)

Note that the schematic rules in (6) and (7) are given in the input language of grounder
gringo (Gebser et al. 2009a). This allows us to omit an explicit listing of some "domain
predicates" in the bodies of rules, which would be necessary when using lparse (Lparse Manual).
At (BioASP Tools), we provide encodings for gringo and also (more verbose ones) for
lparse.

Starting from the answer sets described in the previous subsection, the included atoms
labelE(LacI, LacY, -) and labelV(LacI, +) allow us to derive receive(LacY, -) via a ground
instance of the second rule in (6), while receive(LacY, +) is not derivable. After adding
receive(LacY, -), the solution candidate containing labelV(LacY, -) satisfies the ground
instance of the integrity constraint in (7) obtained by substituting LacY for V and - for S.
Assuming LacI to be an input, as it can be declared via fact input(LacI), we thus obtain an
answer set containing labelV(LacY, -), expressing a decrease of LacY. In contrast, since
receive(LacY, +) is underivable, the solution candidate containing labelV(LacY, +)
violates the following ground instance of (7):

    ← labelV(LacY, +), not receive(LacY, +), not input(LacY).

That is, the solution candidate with labelV(LacY, +) does not pass the consistency test.
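
The test part can be mirrored outside ASP for the running example. The following Python sketch
is an informal analogue of rules (6) and (7), not the encoding itself: receive computes the pairs
(vertex, sign) that rules (6) would derive from a total labeling, and passes_test plays the role
of the integrity constraint (7). Function names and the dictionary representation are
assumptions of this sketch.

```python
def receive(label_v, label_e):
    """Pairs (v, s) such that v receives an influence of sign s, following rules (6)."""
    derived = set()
    for (u, v), edge_sign in label_e.items():
        # Equal signs of edge and regulator yield "+", different signs yield "-".
        derived.add((v, "+" if label_v[u] == edge_sign else "-"))
    return derived


def passes_test(label_v, label_e, inputs):
    """Analogue of (7): every non-input vertex must receive an influence matching its label."""
    derived = receive(label_v, label_e)
    return all((v, s) in derived for v, s in label_v.items() if v not in inputs)


label_e = {("LacI", "LacY"): "-"}
inputs = {"LacI"}

# The two solution candidates generated in Section 4.2 differ only in the label of LacY.
candidates = [{"LacI": "+", "LacY": sign} for sign in ("+", "-")]
survivors = [c for c in candidates if passes_test(c, label_e, inputs)]
print(survivors)
```

Only the candidate labeling LacY with - survives, in line with the discussion above.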
4.4 Soundness and Completeness

By letting $\tau((V, E, \sigma), \mu)$ denote the set of facts representing the problem instance induced
by an influence graph $(V, E, \sigma)$ and a vertex labeling $\mu$, and $P_C$ the logic program
consisting of the rules given in (3), (4), (6), and (7), we can show the following
soundness and completeness results.

Theorem 4.1 (Soundness)

Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+, -\}$ a (partial) vertex labeling.
If there is an answer set of $P_C \cup \tau((V, E, \sigma), \mu)$, then $(V, E, \sigma)$ and $\mu$ are consistent.

Theorem 4.2 (Completeness)

Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+, -\}$ a (partial) vertex labeling.
If $(V, E, \sigma)$ and $\mu$ are consistent, then there is an answer set of $P_C \cup \tau((V, E, \sigma), \mu)$.

The following correspondence result is immediately obtained from Theorems 4.1 and 4.2.

Corollary 4.3 (Soundness and Completeness)

Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+, -\}$ a (partial) vertex labeling.
Then, $(V, E, \sigma)$ and $\mu$ are consistent iff there is an answer set of $P_C \cup \tau((V, E, \sigma), \mu)$.

5 Identifying Minimal Inconsistent Cores 

In view of the usually large amount of data, it is crucial to provide concise explanations
whenever an experimental profile is inconsistent with an influence graph (i.e., if the logic
program given in the previous section has no answer set). To this end, we adopt a strategy
that was successfully applied on real biological data (Guziolowski et al. 2007). The
basic idea is to isolate minimal subgraphs of an influence graph such that the vertices and
edges cannot be labeled consistently. This task is closely related to extracting Minimal
Unsatisfiable Cores (MUCs) (Dershowitz et al. 2006) in the context of Boolean satisfiability
(SAT). In allusion, we call a minimal subgraph of an influence graph whose vertices and
edges cannot be labeled consistently a Minimal Inconsistent Core (MIC), whose formal
definition is as follows.^2

Definition 5.1 (Minimal Inconsistent Core)

Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+, -\}$ a (partial) vertex labeling.
Then, a subset $W$ of $V$ is a Minimal Inconsistent Core (MIC), if

1. for all total extensions $\sigma' : E \to \{+, -\}$ of $\sigma$ and $\mu' : V \to \{+, -\}$ of $\mu$, there is
   some non-input vertex $i \in W$ such that $\mu'(i)$ is inconsistent, and

2. for every $W' \subset W$, there are some total extensions $\sigma' : E \to \{+, -\}$ of $\sigma$ and
   $\mu' : V \to \{+, -\}$ of $\mu$ such that $\mu'(i)$ is consistent for each non-input vertex $i \in W'$.

^2 We note that verifying a MUC is $D^P$-complete (Dershowitz et al. 2006; Papadimitriou and Yannakakis 1982),
and the same applies to MICs in view of the reduction of SAT described in Section 4. However, solving a
decision problem is not sufficient for our application because we also need to provide MIC candidates to
verify. As regards checking inconsistency of an (a priori unknown) MIC candidate, we are unaware of ways to
accomplish such a co-NP test in non-disjunctive ASP without destroying the candidate at hand.
Figure 2. A partially labeled influence graph and a MIC consisting of A and D.

To encode MICs, we make use of three important observations made on Definition 5.1.
First, the inherent inconsistency of a MIC's vertices stipulated in the first condition must
be implied by the MIC and its external regulators, while vertices not connected to the MIC
cannot contribute anything. Moreover, the second condition on proper subsets prohibits
the inclusion of an input vertex in a MIC, as it could always be removed without affecting
inherent (in)consistency of the remaining vertices' variations. Finally, for establishing
consistency of all proper subsets of a MIC, it is sufficient to consider subsets excluding a
single vertex of the MIC, given that their consistency carries forward to all smaller subsets.

For illustration, consider the influence graph and the MIC in Figure 2. One can check
that the observed simultaneous increase of B and D is not consistent with the influence
graph, but the reason for this might not be apparent at first glance. However, once the MIC
consisting of A and D is extracted, we see that the increase of B implies an increase of A, so
that the observed increase of D cannot be explained. Note that the elucidation of inherent
inconsistency provided by a MIC takes its vertices along with their regulators into account,
the latter being incapable of jointly explaining the variations of all vertices in the MIC.
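
Definition 5.1, together with the second and third observation, can be prototyped by exhaustive
search. The Python sketch below is an illustration only and not the ASP encoding developed next.
It assumes that all edges are labeled, and it uses a hypothetical three-vertex graph (B activates
A, A inhibits D, B is an input, B and D are observed to increase) that is merely modeled on the
description of Figure 2 rather than copied from it. The function names are likewise assumptions.

```python
from itertools import combinations, product


def explainable(targets, edges, mu):
    """True if some total vertex labeling extending mu explains every vertex in targets.

    edges: dict (j, i) -> sign (+1 or -1); all edges are assumed to be labeled.
    mu:    partial vertex labeling, dict vertex -> sign
    """
    vertices = sorted({v for edge in edges for v in edge})
    free = [v for v in vertices if v not in mu]
    for signs in product((+1, -1), repeat=len(free)):
        total = {**mu, **dict(zip(free, signs))}
        if all(
            any(total[j] * s == total[i] for (j, k), s in edges.items() if k == i)
            for i in targets
        ):
            return True
    return False


def minimal_inconsistent_cores(edges, inputs, mu):
    """Enumerate MICs among the non-input vertices (exponential; tiny graphs only)."""
    vertices = sorted({v for edge in edges for v in edge})
    candidates = [v for v in vertices if v not in inputs]  # input vertices never belong to a MIC
    cores = []
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            core = set(subset)
            if explainable(core, edges, mu):
                continue  # first condition of Definition 5.1 fails
            # Second condition: it suffices to drop one vertex at a time.
            if all(explainable(core - {v}, edges, mu) for v in core):
                cores.append(core)
    return cores


EDGES = {("B", "A"): +1, ("A", "D"): -1}  # hypothetical graph modeled on Figure 2
print(minimal_inconsistent_cores(EDGES, inputs={"B"}, mu={"B": +1, "D": +1}))
```

On this graph, neither A nor D alone is inconsistent, but the pair is, so the only MIC reported
consists of A and D.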

We next provide an encoding for identifying MICs, where a problem instance, that is, an
influence graph along with an experimental profile, is represented by facts as specified in
Section 4.1. The encoding then consists of three parts: the first generating MIC candidates,
the second asserting inconsistency, and the third verifying minimality.



5.1 Generating MIC Candidates

The generating part comprises rules in (4) for deriving known vertex and edge labels. In
addition, it includes the following rules:

    active(V); inactive(V) ← vertex(V), not input(V).
    edgeMIC(U, V) ← edge(U, V), active(V).
    vertexMIC(U) ← edgeMIC(U, V).
    vertexMIC(V) ← active(V).                                (8)
    labelV(V, +); labelV(V, -) ← vertexMIC(V).
    labelE(U, V, +); labelE(U, V, -) ← edgeMIC(U, V).

The first rule permits guessing non-input vertices forming a MIC candidate. Such vertices
are marked as active. The subgraph of the influence graph consisting of the active vertices,
their regulators, and the connecting edges provides the context of the MIC candidate.^3 The
vertices and edges contributing to this subgraph are identified via vertexMIC and edgeMIC.
The guessing of (unobserved) vertex and edge labels is restricted to them in the last two
rules of (8). Finally, note that the rules in (4) propagate known labels also for vertices and
edges not correlated to the MIC candidate, viz., to the active vertices. This does not incur
additional combinatorics; rather, it reduces derivations depending on MIC candidates.



5.2 Testing for Inconsistency 

By adapting the methodology used in (Eiter and Gottlob 1995), the following subprogram
makes sure that the active vertices cannot be labeled consistently, taking (implicitly) into
account all possible labelings for them, their regulators, and connecting edges.^4

    opposite(U, V) ← labelE(U, V, -), labelV(U, S), labelV(V, S).
    opposite(U, V) ← labelE(U, V, +), labelV(U, S), labelV(V, T), S ≠ T.
    bottom ← active(V), opposite(U, V) : edge(U, V).
    ← not bottom.                                            (9)
    labelV(V, +) ← bottom, vertex(V).
    labelV(V, -) ← bottom, vertex(V).
    labelE(U, V, +) ← bottom, edge(U, V).
    labelE(U, V, -) ← bottom, edge(U, V).

In this (part of the) encoding, opposite(U, V) indicates that the influence of regulator U
on V is opposite to the variation of V. If all regulators of an active vertex V have such an
opposite influence, the sign consistency constraint for V is violated, in which case atom
bottom along with all labels for vertices and edges are derived. Note that the stability
criterion for an answer set $X$ imposes that bottom and all labels belong to $X$ only if the
active vertices cannot be labeled consistently. Finally, integrity constraint ← not bottom
necessitates the inclusion of bottom in any answer set, thus, stipulating an inevitable sign
consistency constraint violation for some active vertex.

Reconsidering our example in Figure 2, the ground instances of (8) permit guessing
active(A) and active(D). When labeling A with + (or assuming labelV(A, +) to be true), we
derive opposite(A, D) and bottom, producing in turn all labels for vertices and edges.
Furthermore, setting the sign of A to - (or labelV(A, -) to true) makes us derive opposite(B, A),
which again gives bottom and all labels for vertices and edges. We have thus verified that
the sign consistency constraints for A and D cannot jointly be satisfied, given the observed
increases of B and D. That is, active vertices A and D are sufficient to explain the
inconsistency between the observations and the influence graph.



^3 In Definition 5.1, (in)consistency is checked only for the (non-input) vertices in a MIC, while other vertices'
variations do not need to be explained. Hence, guessing unobserved vertex (and edge) labels can be restricted
to vertices belonging to or connected to the MIC, which reduces combinatorics.

^4 In the language of gringo (and lparse), the expression opposite(U, V) : edge(U, V) used in (9) refers to the
conjunction of all ground atoms opposite(j, i) for which edge(j, i) holds.

5.3 Testing for Minimality

It remains to be verified whether the sign consistency constraints for all active vertices are 
necessary to identify an inherent inconsistency. This test is based on the idea that, excluding 
any single active vertex, the sign consistency constraints for the other active vertices should 
be satisfied by appropriate labelings, which can be implemented as follows: 

labelV'iW, V, +);labelV'{W, V, -) ^ active{W),vertexMIC{V). 
labelE'{W, U, V, +) ; labelE'{W, U, V, -) ^ active{W) , edgeMIC{U, V) . 

labelV'iW, V, S) ^ active{W), observedV{V, S). 
labelE'iW, U, V, S) ^ active{W), observedE{U, V, S). (10) 

receive'iW, V, +) 4- labelE'{W, U, V, S),labelV'{W, U,S),V ^W. 
receive'iW, V,-) 4- labelE'{W, U, V, S),labelV'{W, U, T),V ^W,S ^ T. 

^ labelV'iW, V, S),active{V),V ^ W, not receive'{W, V, S). 

This subprogram is similar to the consistency check encoded via the rules in (|3]l, (|4|i, Q, 
and (|7]i. However, sign consistency constraints are only checked for active vertices, and 
they must be satisfiable for all but one arbitrary active vertex W. In fact, labelings such that 
the variations of all active vertices but W are explained witness the fact that W cannot be 
removed from a MIC candidate without re-establishing consistency. As W ranges over all 
(non-input) vertices of an influence graph, each active vertex is taken into consideration. 
Regarding computational complexity, recall from Section |4] that checking consistency is 
NP-complete. As a consequence, one cannot easily identify conditions to select a particular 
witness for consistency of a MIC candidate minus some vertex W, and so we do not encode 
any such conditions. This leads to the potential of multiple answer sets comprising the same 
MIC but different witnesses, in particular, if many vertices and edges belong to the context 
of the MIC. 

For the influence graph in Figure 2, it is easy to see that the sign consistency constraint
for A is satisfied by setting the sign of A to +, expressed by atom labelV'(D, A, +) in the
ground rules obtained from the above encoding part. In turn, the sign consistency constraint
for D is satisfied by setting the sign of A to -. This is reflected by atom labelV'(A, A, -),
allowing us to derive receive'(A, D, +). That is, the ground instance of the above integrity
constraint containing labelV'(A, D, +) is satisfied. The fact that atoms labelV'(D, A, +) and
labelV'(A, A, -), used for explaining the variation of either A or D, respectively, disagree
on the sign of A also shows that jointly considering A and D yields an inconsistency.
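The minimality test can also be illustrated procedurally. The following Python sketch is not part of the ASP encoding: it checks by brute force whether a vertex set is inconsistent while every set obtained by dropping one vertex W is consistent. The three-vertex graph and the names consistent and is_mic are assumptions made for illustration (the edges of Figure 2 are not reproduced here), and signs are represented as 1 and -1.

```python
from itertools import product

def consistent(vertices, edges, sigma, mu, subset):
    """Brute force: is there a total labeling extending sigma and mu under which
    every vertex in `subset` receives an influence matching its own sign?"""
    free_v = [v for v in vertices if v not in mu]
    free_e = [e for e in edges if e not in sigma]
    for signs in product((1, -1), repeat=len(free_v) + len(free_e)):
        mu2 = {**mu, **dict(zip(free_v, signs[:len(free_v)]))}
        sigma2 = {**sigma, **dict(zip(free_e, signs[len(free_v):]))}
        if all(any(mu2[j] * sigma2[(j, t)] == mu2[i] for (j, t) in edges if t == i)
               for i in subset):
            return True
    return False

def is_mic(vertices, edges, sigma, mu, candidate):
    """A MIC is inconsistent, but dropping any single vertex W restores consistency."""
    if consistent(vertices, edges, sigma, mu, candidate):
        return False
    return all(consistent(vertices, edges, sigma, mu, candidate - {w})
               for w in candidate)

# Hypothetical graph: B -> A (+), A -> D (-), D -> B (+); B and D are observed as +.
V = ["A", "B", "D"]
E = [("B", "A"), ("A", "D"), ("D", "B")]
SIGMA = {("B", "A"): 1, ("A", "D"): -1, ("D", "B"): 1}
MU = {"B": 1, "D": 1}

print(is_mic(V, E, SIGMA, MU, {"A", "D"}))       # witnesses for A and D disagree on A
print(is_mic(V, E, SIGMA, MU, {"A", "B", "D"}))  # inconsistent, but not minimal
```

The enumeration of total labelings is exponential, so the sketch only serves to make the definition concrete on tiny graphs; the encoding in (10) delegates this search to the ASP solver.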

5.4 Soundness and Completeness 

Similar to Section 4.4, we can show soundness and completeness of our MIC extraction
encoding $P_D$, which consists of the rules presented earlier in this section, including (9) and (10).

Theorem 5.1 (Soundness) 

Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+,-\}$ a (partial) vertex labeling.
If $X$ is an answer set of $P_D \cup \tau((V, E, \sigma), \mu)$, then $\{i \mid active(i) \in X\}$ is a MIC.

Theorem 5.2 (Completeness) 
Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+,-\}$ a (partial) vertex labeling.
If $W \subseteq V$ is a MIC, then there is an answer set $X$ of $P_D \cup \tau((V, E, \sigma), \mu)$ such that
$\{i \mid active(i) \in X\} = W$.

The following correspondence result is immediately obtained from Theorems 5.1 and 5.2.



Corollary 5.3 (Soundness and Completeness) 

Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+,-\}$ a (partial) vertex labeling.
Then, $W \subseteq V$ is a MIC iff there is an answer set $X$ of $P_D \cup \tau((V, E, \sigma), \mu)$ such that
$\{i \mid active(i) \in X\} = W$.

As mentioned above, several answer sets may represent the same MIC because witnesses 
needed for minimality testing are not necessarily unique. 

6 Refinements 

In this section, we detail two encoding extensions aiming at the improvement of grounding 
and solving efficiency. First, input reduction checks for some simple cases to identify and 
distinguish uncritical vertices. Second, background knowledge about MICs' connectivity 
can be exploited to more precisely render potential MIC candidates. 

6.1 Input Reduction 

It is not unlikely in practice that biological networks include simple tractable substructures 
or that parts of experimental observations are easily explained. Dealing with such particular 
cases before doing complex computations (like checking consistency or finding MICs) is 
therefore advisable. Given an influence graph $(V, E, \sigma)$ and a partial vertex labeling $\mu$
capturing experimental data, we below describe conditions to identify vertices that can 
always be labeled consistently. Such vertices can then be marked as (additional) inputs to 
exclude their sign consistency constraints from consistency checking and to make explicit 
that they cannot belong to any MIC. Any of the following conditions is sufficient to identify 
a vertex i as effectively unconstrained: 

1. There is a regulation $i \to i$ in $E$ such that $\sigma(i, i) = +$, that is, $i$ supports its variation.

2. There is a regulation $j \to i$ in $E$ such that $\sigma(j, i)$ is undefined. In fact, undetermined
regulations are used in practice to model influences that vary, e.g., relative to
environmental conditions. Any variation of the target $i$ of such a regulation can be
explained by assigning the appropriate label to $j \to i$ (w.r.t. the label of $j$).

3. There are regulations $j \to i$, $k \to i$ in $E$ such that $\mu(j)\sigma(j, i) = +$ and $\mu(k)\sigma(k, i) = -$.
That is, any variation of $i$ is already explained by the given observations.

4. An observed variation $\mu(i)$ of $i$ is explained if there is some regulation $j \to i$ in $E$
such that $\mu(j)\sigma(j, i) = \mu(i)$. Any further regulations targeting $i$ can be ignored.

5. If for all regulations $i \to k$ in $E$, we have that $k$ is an input, then the variation of $i$
is insignificant for its targets. In this case, if $i$ is unobserved ($\mu(i)$ is undefined) and
target of at least one regulation $j \to i$ in $E$, we can assign an appropriate label to $i$
(w.r.t. the labels of $j$ and $j \to i$) without any further conditions.

6. There is a regulation $j \to i$ in $E$ such that $j$ is unobserved ($\mu(j)$ is undefined), an
input, and all targets $k \neq i$ of $j$ ($j \to k$ belongs to $E$) are inputs. Without any further
conditions, we can assign an appropriate label to $j$ for explaining the variation of $i$.

Figure 3. A partially labeled influence graph with uncritical vertices surrounded by dots.

The reduction idea is to mark a vertex i as additional input, if it meets one of the above 
conditions. Since the two last conditions inspect inputs, they may become applicable to 
further vertices once inputs are added. Hence, checking the conditions and adding inputs 
needs to be done exhaustively. As we see below, this can easily be encoded in ASP. 

Reconsidering the influence graph and partial observations in Figure 2, we see that vertex B
receives an influence from D matching its observed increase. Thus, the fourth condition
applies to already explained vertex B. Moreover, vertex E is unobserved and does
not regulate anything. That is, the fifth condition applies to E, and its variation can simply 
be picked from influences it receives from A, C, and D. After establishing that E can be 
labeled consistently, we find that C does not regulate any critically constrained vertex. Ap- 
plying again the fifth condition, we notice that the variation of C is actually insignificant. 

Figure 3 shows the situation resulting from the identification of uncritical vertices by
iteratively applying the above conditions. The fact that only A and D are critically
constrained tells us that only they can belong to a MIC. As a consequence, the MIC containing
A and D, shown on the right-hand side of Figure 2, is the only one in this example.

The aforementioned idea to mark uncritical vertices as input can be encoded as follows: 

obs(V) ← observedV(V, S).
get(V, +) ← observedE(U, V, S), observedV(U, S).
get(V, -) ← observedE(U, V, S), observedV(U, T), S ≠ T.

input(V) ← observedE(V, V, +).
input(V) ← edge(U, V), not observedE(U, V, +), not observedE(U, V, -).
input(V) ← get(V, +), get(V, -).
input(V) ← observedV(V, S), get(V, S).
input(V) ← edge(U, V), input(W) : edge(V, W), not obs(V).
input(V) ← edge(U, V), input(W) : edge(U, W) : W ≠ V, input(U), not obs(U).

Auxiliary predicates obs and get are used to exhibit whether either variation has been
observed for a vertex and whether a particular influence is received for certain, respectively.
The last six rules check the described conditions (in the same order) and mark a vertex as
input if one of them applies. Importantly, the above rules are stratified and thus yield a
unique set of derived input vertices. This allows us to perform the reduction efficiently
within grounding, without deferring to any procedural implementation external to ASP.

Figure 4. A partially labeled influence graph and the graph $(V[\{A,D\}], E[\{A,D\}])$.

The situation shown in Figure 3 is reflected by the reduction encoding deriving atoms
input(B), input(C), and input(E) from an instance (cf. Section 4.1) corresponding to the
depicted influence graph and observed variations. Consistency checking and MIC
identification (cf. Sections 4 and 5) can then focus on the remaining non-input vertices A and D.
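For readers who prefer a procedural view, the exhaustive application of the six conditions can be sketched as a fixpoint loop in Python. This is an illustration only, not the method used in the paper, which performs the reduction within grounding. The function name reduce_inputs and the five-vertex graph are assumptions: the graph is merely chosen to agree with what the text states about the running example, and signs are represented as 1 and -1.

```python
def reduce_inputs(vertices, edges, sigma, mu, inputs=()):
    """Exhaustively mark vertices that meet one of the six conditions as inputs."""
    inputs = set(inputs)
    preds = {v: [j for (j, i) in edges if i == v] for v in vertices}
    succs = {v: [k for (j, k) in edges if j == v] for v in vertices}

    def got(v):
        # Influences received for certain: observed regulator over a labeled edge.
        return {mu[j] * sigma[(j, v)] for j in preds[v] if j in mu and (j, v) in sigma}

    def uncritical(v):
        g = got(v)
        return (
            sigma.get((v, v)) == 1                             # 1: supports itself
            or any((j, v) not in sigma for j in preds[v])      # 2: undetermined regulation
            or g == {1, -1}                                    # 3: both influences received
            or (v in mu and mu[v] in g)                        # 4: observation explained
            or (v not in mu and bool(preds[v])                 # 5: all targets are inputs
                and all(k in inputs for k in succs[v]))
            or any(j in inputs and j not in mu                 # 6: free input regulator
                   and all(k in inputs for k in succs[j] if k != v)
                   for j in preds[v])
        )

    changed = True
    while changed:  # conditions 5 and 6 may become applicable once inputs are added
        changed = False
        for v in vertices:
            if v not in inputs and uncritical(v):
                inputs.add(v)
                changed = True
    return inputs

# Hypothetical graph and observations (B and D observed as increased).
VERTICES = ["A", "B", "C", "D", "E"]
SIGMA = {("B", "A"): 1, ("A", "D"): -1, ("D", "B"): 1,
         ("A", "E"): 1, ("C", "E"): 1, ("D", "E"): -1, ("D", "C"): 1}
EDGES = list(SIGMA)
MU = {"B": 1, "D": 1}

print(sorted(reduce_inputs(VERTICES, EDGES, SIGMA, MU)))
```

On this assumed graph, B is covered by the fourth condition, E by the fifth, and C by the fifth only after E has been marked, which mirrors the iteration described above.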

6.2 Exploiting Strongly Connected Components for MIC Extraction 

In what follows, we introduce a connectivity property of MICs that can be used to further
refine the encoding presented in Section 5. Incorporating additional background knowledge
into the problem encoding is straightforward (as soon as such knowledge is established). In 
practice, ancillary (and actually redundant) conditions may significantly narrow and thus 
speed up both the grounding and the solving process. 

MIC Connectivity Property. For analyzing interactions within a MIC, we make use of a
graph described in the following. Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+,-\}$
be a (partial) vertex labeling, and let $D(\mu)$ denote the set of vertices labeled by $\mu$. For a
set $W \subseteq V$ of vertices, we define a graph $(V[W], E[W])$ by:

$V[W] = W \cup \{ j \mid (j \to i) \in E, i \in W \}$

$E[W] = \{ (j \to i) \mid (j \to i) \in E, i \in W \} \cup \{ (i \to j) \mid (j \to i) \in E, i \in W, j \notin D(\mu) \}.$

The construction of $(V[W], E[W])$ is based on the idea that a regulator $j$ of some $i \in W$ is
connected to $i$ via its sign consistency constraint, and a connection in the opposite direction
applies if $j$ is unlabeled by $\mu$. In fact, given some total extensions $\sigma' : E \to \{+,-\}$ of $\sigma$ and
$\mu' : V \to \{+,-\}$ of $\mu$, we can check a matching influence of $j$ on $i$ by $\mu'(i) = \mu'(j)\sigma'(j, i)$
or equivalently by $\mu'(j) = \mu'(i)\sigma'(j, i)$. That is, provided that $\mu(j)$ is undefined, $\mu'(i)$
constrains $\mu'(j)$ by contraposition whenever $i$ does not receive a matching influence from
any other regulator than $j$. This observation motivates the inclusion of inverse edges from
vertices in $W$ to regulators unlabeled by $\mu$ in $E[W]$.

For illustration, the right-hand side of Figure 4 shows graph $(V[\{A,D\}], E[\{A,D\}])$
resulting from the partially labeled influence graph on the left-hand side. The single
regulator B of A is labeled, and thus there is no inverse edge from A to B in $E[\{A,D\}]$. On
the other hand, A is an unlabeled regulator of D, and so $E[\{A,D\}]$ includes an inverse
edge from D to A. The addition of this edge turns the subgraph of $(V[\{A,D\}], E[\{A,D\}])$
induced by A and D into a strongly connected component. In view that A and D belong to
a MIC (as discussed in Section 5), we below show that this connectivity is not by chance.
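The construction of $(V[W], E[W])$ can be sketched in a few lines of Python. The sketch only covers the two regulations that the text states explicitly (B regulates A, and A regulates D, with B labeled and A unlabeled); any further edges of Figure 4 are omitted, and the function names are assumptions made for illustration.

```python
def graph_for(edges, labeled, W):
    """Return (V[W], E[W]) for a vertex set W, given the set D(mu) of labeled vertices."""
    VW = set(W) | {j for (j, i) in edges if i in W}
    EW = {(j, i) for (j, i) in edges if i in W}
    # Inverse edges from vertices in W to their unlabeled regulators.
    EW |= {(i, j) for (j, i) in edges if i in W and j not in labeled}
    return VW, EW

def reachable(arcs, start):
    """Vertices reachable from `start` via at least one arc."""
    seen, todo = set(), [start]
    while todo:
        u = todo.pop()
        for (a, b) in arcs:
            if a == u and b not in seen:
                seen.add(b)
                todo.append(b)
    return seen

def strongly_connected(arcs, W):
    """True if all distinct vertices of W reach each other."""
    return all(v in reachable(arcs, u) for u in W for v in W if u != v)

EDGES = [("B", "A"), ("A", "D")]  # only the regulations mentioned in the text
LABELED = {"B", "D"}              # assumption: D is labeled, as in the running example

VW, EW = graph_for(EDGES, LABELED, {"A", "D"})
print(sorted(VW), sorted(EW), strongly_connected(EW, {"A", "D"}))
```

The inverse edge from D to A is what makes A and D mutually reachable, while the labeled regulator B contributes no inverse edge.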



Theorem 6.1 (MIC Connectivity) 

Let $(V, E, \sigma)$ be an influence graph and $\mu : V \to \{+,-\}$ a (partial) vertex labeling.
If $W \subseteq V$ is a MIC, then all vertices in $W$ belong to the same strongly connected
component in $(V[W], E[W])$.

The proof is omitted in view of space limitations and can be obtained from the authors. 



Optimized MIC Encoding. We now apply Theorem 6.1 to improve the basic MIC extraction
encoding (cf. Section 5) in two aspects: adding (redundant) constraints for search
space pruning and adding positive body literals for reducing grounding efforts. The following
rules pave the way by determining the (non-trivial) strongly connected components
in $(V, E[V])$ as an over-approximation of the ones in $(V[W], E[W])$ for any $W \subseteq V$:

edges(U, V) ← edge(U, V), not input(V).
edges(V, U) ← edge(U, V), not input(V), not observedV(U, +), not observedV(U, -).
reach(U, V) ← edges(U, V).                                                  (11)
reach(U, V) ← edges(U, W), reach(W, V), vertex(V).
cycle(U, V) ← reach(U, V), reach(V, U), U ≠ V.

The first rule simply collects edges whose targets are not input, while the second rule adds
edges in the inverse direction for unobserved regulators. Reachability w.r.t. the so obtained
graph is determined via the third and the fourth rule. Finally, predicate cycle indicates
whether two (distinct) vertices reach each other in $(V, E[V])$ relative to an influence graph
$(V, E, \sigma)$ and a (partial) vertex labeling $\mu$. In fact, if two vertices belong to a MIC $W \subseteq V$,
then mutual reachability in $(V[W], E[W])$ implies the same in $(V, E[V])$, in view that
$V[W] \subseteq V$ and $E[W] \subseteq E[V]$. Conversely, if two vertices do not reach each other in
$(V, E[V])$, then they cannot jointly belong to any MIC.

The over-approximation of potential MICs provides an easy means to prune the search 
space by adding the following integrity constraint: 

← active(U), active(V), U < V, not cycle(U, V).                             (12)

The constraint makes the fact explicit that distinct vertices of a MIC must reach each other 
in $(V, E[V])$, and it immediately refutes MIC candidates that do not satisfy this condition.
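To make the effect of (11) and (12) concrete, the following Python sketch computes the cycle relation by a naive transitive closure and then tests a MIC candidate against it. It mirrors the rules only in spirit: the function names and the example graph are assumptions made for illustration, and the closure is computed far less efficiently than a grounder would do it.

```python
def cycle_pairs(edges, inputs, observed):
    """Pairs of distinct vertices that reach each other, following the rules in (11)."""
    arcs = {(u, v) for (u, v) in edges if v not in inputs}
    # Inverse arcs for unobserved regulators of non-input targets.
    arcs |= {(v, u) for (u, v) in edges if v not in inputs and u not in observed}
    reach = set(arcs)
    changed = True
    while changed:  # naive transitive closure
        new = {(u, v) for (u, w) in arcs for (x, v) in reach if w == x} - reach
        reach |= new
        changed = bool(new)
    return {(u, v) for (u, v) in reach if u != v and (v, u) in reach}

def passes_constraint_12(cycle, candidate):
    """A candidate survives (12) only if all its vertex pairs are mutually reachable."""
    return all((u, v) in cycle for u in candidate for v in candidate if u < v)

# Hypothetical instance: B, C, and E are assumed to be inputs after reduction.
EDGES = [("B", "A"), ("A", "D"), ("D", "B"), ("A", "E"),
         ("C", "E"), ("D", "E"), ("D", "C")]
CYCLE = cycle_pairs(EDGES, inputs={"B", "C", "E"}, observed={"B", "D"})

print(sorted(CYCLE))
print(passes_constraint_12(CYCLE, {"A", "D"}), passes_constraint_12(CYCLE, {"A", "B"}))
```

Since the relation does not depend on which vertices are active, it can be computed once per instance, which is why the first optimized encoding leaves it to the grounder.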
After making use of Theorem 6.1 to narrow search, we now shift the focus to grounding.
As a matter of fact, the quadratic space complexity of the minimality test's ground
instantiation, as encoded in (10), is a major bottleneck in scaling. The knowledge about potential
pairwise connected vertices in MICs, represented by integrity constraint (12), also allows
us to include positive body literals in order to restrict the scope of minimality tests:
labelV'(W, V, +) ; labelV'(W, V, -) ← active(W), active(V), cycle(V, W).
labelV'(W, U, +) ; labelV'(W, U, -) ← active(W), edgeMIC(U, V), cycle(V, W).
labelE'(W, U, V, +) ; labelE'(W, U, V, -) ← active(W), edgeMIC(U, V), cycle(V, W).

labelV'(W, V, S) ← active(W), observedV(V, S), cycle(V, W).
labelV'(W, U, S) ← active(W), observedV(U, S), edge(U, V), cycle(V, W).     (13)
labelE'(W, U, V, S) ← active(W), observedE(U, V, S), cycle(V, W).

receive'(W, V, +) ← labelE'(W, U, V, S), labelV'(W, U, S).
receive'(W, V, -) ← labelE'(W, U, V, S), labelV'(W, U, T), S ≠ T.

← labelV'(W, V, S), active(V), cycle(V, W), not receive'(W, V, S).

In comparison to (10), the extra condition cycle(V, W) in the bodies of the first three rules
establishes that labels used for testing minimality are guessed only for pairs W and V of
vertices that can potentially jointly belong to a MIC. The same restriction is used in the next
three rules forwarding observed vertex and edge labels, but now limited to vertices that can
jointly belong to a MIC and to their respective regulators. Finally, the last two rules and the
integrity constraint perform the same test as in (10) for a restricted set of pairs W and V.
(The fact that cycle(V, W) implies V ≠ W in labelE'(W, U, V, S) also allows us to drop
this condition, used in (10), from the bodies of the rules defining receive'.)

The complete optimized MIC encoding consists of the original rules apart from (10),
with (11) and (12) as add-ons and (13) as a replacement for (10). As regards the computational
impact, we note that the optimized encoding needs less than two seconds for grounding
and finding all MICs on the case study in Section 7.3, which took more than a minute with
the unoptimized encoding.

A second version of the optimized encoding is obtained by tightening the consideration
of connected vertices in $(V[W], E[W])$ relative to a MIC candidate $W$. This can be
achieved by adding condition active(V) to the rules in (11) defining the edges predicate.
In this way, the static reachability information encoded in (11), which is completely
evaluated by grounder gringo, is turned into a dynamic relation computed during search. As
it turns out, there is no significant performance difference between these two versions of
the optimized MIC extraction encoding on the case study in Section 7.3. Hence, more real
examples are needed to reliably compare their grounding and solving efficiency.

7 Empirical Evaluation and Application 

For assessing the scalability of our approach, we start by conceiving a parameterizable 
suite of artificial yet biologically meaningful benchmarks. After that, we present a typical 
application stemming from real biological data, illustrating the exertion in practice. All 
experiments were performed using input reduction as explained in Section 6.1.

7.1 Checking Consistency 

We first evaluate our approach on randomly generated instances, aiming at structures simi- 
lar to those found in biological applications. Instances are composed of an influence graph, 
a complete labeling of its edges, and a partial labeling of its vertices. Our random generator
takes three parameters: (i) the number $\alpha$ of vertices in the influence graph, (ii) the average
degree $\beta$ of the graph, and (iii) the proportion $\gamma$ of observed variations for vertices. To
generate an instance, we compute a random graph with $\alpha$ vertices (the value of $\alpha$ varying
from 500 to 4000) under the model by Erdős and Rényi (1959). Each pair of vertices has
equal probability to be connected via an edge, whose label is chosen independently with
probability 0.5 for both signs. We fix the average degree $\beta$ to 2.5, which is considered to
be a typical value for biological networks (Jeong et al. 2000). Finally, $\lfloor \gamma\alpha \rfloor$ vertices are
chosen with uniform probability and assigned a label with probability 0.5 for both signs.
For each number $\alpha$ of vertices, we generated 50 instances using five different values for $\gamma$,
viz., 0.01, 0.02, 0.033, 0.05, and 0.1. All instances are available at (BioASP Tools).

Table 2. Run-times for consistency checking with claspD, cmodels, dlv, and gnt.
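A generator along these lines can be sketched in Python as follows. This is not the generator used for the experiments: the function name, the seed parameter, the exclusion of self-loops, and the reading of the average degree as the expected number of outgoing edges per vertex are all assumptions made for illustration.

```python
import math
import random

def generate_instance(alpha, beta=2.5, gamma=0.05, seed=None):
    """Random influence graph with alpha vertices, average degree beta, and
    floor(gamma * alpha) observed vertices (illustrative sketch only)."""
    rng = random.Random(seed)
    vertices = list(range(alpha))
    # Assumption: each ordered pair of distinct vertices is connected with the
    # same probability, chosen such that a vertex has beta outgoing edges on average.
    p = beta / (alpha - 1)
    sigma = {}
    for u in vertices:
        for v in vertices:
            if u != v and rng.random() < p:
                sigma[(u, v)] = rng.choice("+-")  # complete edge labeling
    observed = rng.sample(vertices, math.floor(gamma * alpha))
    mu = {v: rng.choice("+-") for v in observed}  # partial vertex labeling
    return vertices, sigma, mu

vertices, sigma, mu = generate_instance(200, gamma=0.1, seed=1)
print(len(vertices), len(sigma), len(mu))
```

The quadratic loop over vertex pairs is adequate for the graph sizes considered here; for much larger graphs, sampling the edges directly would be preferable.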

We used gringo (2.0.0) (Gebser et al. 2009a) for combining the generated instances and
the encoding given in Section 4 into equivalent ground programs. For checking consistency
by computing an answer set (if it exists), we ran disjunctive ASP solvers claspD (1.1)
(Drescher et al. 2008) with "Berkmin", "VMTF", and "VSIDS" heuristics, cmodels (3.75)
(Giunchiglia et al. 2006) using zchaff, dlv (BEN/Oct 11) (Leone et al. 2006), and gnt (2.1)
(Janhunen et al. 2006). All runs were performed on a Linux machine equipped with an
AMD Opteron 2 GHz processor and a memory limit of 2GB RAM.

Table 2 shows average run-times in seconds over 50 instances per number $\alpha$ of vertices,
including grounding times of gringo and solving times. We checked that grounding times
of gringo increase linearly with the number $\alpha$ of vertices, and they do not vary significantly
over $\gamma$. For all solvers, run-times also increase linearly in $\alpha$. For fixed $\alpha$ values, we found
two clusters of instances: consistent ones where total labelings were easy to compute, and
inconsistent ones where inconsistency was detected from preassigned labels. This tells
us that the influence graphs generated as described above are usually (too) easy to label
consistently, and inconsistency only occurs if it is explicitly introduced via fixed labels.
However, such constellations are not unlikely in practice (cf. Section 7.3), and isolating
MICs from them, as done in the next subsection, turned out to be hard for most solvers.
Finally, greater values for $\gamma$ led to an increased proportion of inconsistent instances, without
making them much harder.



Footnote: Longer run-times of claspD with "Berkmin" in comparison to the other heuristics are due to a more
expensive computation of heuristic values in the absence of conflict information. Furthermore, the time needed for
performing "Lookahead" slows down dlv as well as gnt.
Table 3. Run-times for grounding with gringo and solving with claspD.



7.2 Identifying Minimal Inconsistent Cores 



We now investigate the problem of finding a MIC within the same setting as in the previous
subsection. Because of the elevated size of ground instantiations and problem difficulty, we
varied the number $\alpha$ of vertices from 50 to 300, thus using considerably smaller influence
graphs than before. We again use gringo for grounding, now taking the encoding given in
Section 5. As regards solving, we restrict our attention to claspD because all three of the
other solvers showed drastic performance declines.

Table 3 shows average run-times in seconds over 50 instances per number $\alpha$ of vertices.
Timeouts, indicated in parentheses, are taken as maximum time of 1800 seconds.
We observe a quadratic increase in grounding times of gringo, which is in line with the
fact that ground instantiations for our MIC encoding grow quadratically with the size of
influence graphs. In fact, the schematic rules in Section 5.3 give rise to $\alpha$ copies of an
influence graph. Considering solving times spent by claspD for finding one MIC (if it
exists), we observe that they are relatively stable, in the sense that they are tightly correlated
to grounding times. This regularity again confirms that, though it is random, the applied
generation pattern tends to produce rather uniform influence graphs. Finally, we observed
that unsatisfiable instances, i.e., consistent instances without any MIC, were easier to solve
than the ones admitting answer sets. We conjecture that this is because consistent total
labelings provide a disproof of inconsistency as encoded in Section 5.2.

As our experimental results demonstrate, computing MICs is computationally harder
than just checking consistency. This is not surprising because the related (yet simpler)
decision problem of verifying a MUC is $D^P$-complete (Dershowitz et al. 2006; Papadimitriou and Yannakakis 1982)
and thus more complex than just deciding satisfiability. With our declarative technique, we
spot the quadratic space blow-up incurred by the MIC encoding in Section 5 as a bottleneck.
However, there are approaches aiming at a reduction of grounding efforts, and some
of them have been presented in Section 6.



Detecting Inconsistencies in Large Biological Networks with ASP 












Figure 5. Some MICs obtained by comparing the regulatory network of yeast with a ge- 
netic profile. 

7.3 Biological Case Study 

In the following, we present the results of applying our approach to real-world data of
genetic regulations in yeast. We tested the gene-regulatory network of yeast provided in
(Guelzim et al. 2002) against genetic profile data of snf2 knock-outs (Sudarsanam et al. 2000)
from the Saccharomyces Genome Database. The regulatory network of yeast contains
909 genetic or biochemical regulations, all of which have been established experimentally,
among 491 genes.

Comparing the yeast regulatory network with the genetic profile of snf2, we found the
data to be inconsistent with the network, which was easily detected using the approach
of Section 4. Applying our diagnosis technique from Section 5, we obtained a total of 19
MICs. While computing the first MIC took less than a second using gringo and claspD
(regardless of the heuristic used), the computation of all MICs was considerably harder.
Using "VMTF" as search heuristic on top of the enumeration algorithm (Gebser et al. 2007)
inherited from clasp (Gebser et al. 2009c), claspD found all 19 MICs in about 30 seconds,
while another 40 seconds were needed to decide that there is no further MIC. With
"VSIDS", finding the 19 MICs took about the same time as with "VMTF", but another
80 seconds were used to verify that all MICs had been found. Finally, using the "Berkmin"
heuristic, 12 MICs had been found before aborting after 30 minutes. The observation that
search heuristics matter tells us that investigations into the structure of biological problems
and particular methods to solve them efficiently can earn considerable benefits.
Furthermore, we note that the potential existence of multiple answer sets encompassing the same


Footnote: Saccharomyces Genome Database, http://www.yeastgenome.org
Footnote: Notably, by exploiting additional background knowledge, the optimized encoding presented in Section 6.2
requires less than two seconds (regardless of heuristics) for grounding and finding all 19 MICs. In fact, its
ground instantiation contains only 8481 atoms and 10843 rules, compared to 47260 atoms and 56522 rules
with the basic encoding in Section 5. In addition to problem size, also the difficulty drops dramatically: from
23345 conflicts down to 270 conflicts, encountered with "VMTF" heuristic during search for all answer sets.

Figure 6. Subgraph obtained by connecting the six MICs given in Figure 5.

MIC did not emerge on the yeast network and snf2 knock-out data. That is, we obtained 19 
answer sets, each one corresponding one-to-one to a MIC. 

Six of the computed MICs are shown as examples in Figure 5. While the first three of
them are fairly obvious, we also identified more complex topologies. However, our example
demonstrates that the MICs obtained in practice are still small enough to be understood
easily. For finding suitable corrections to the inconsistencies, it is often even more helpful
to display the connections between several overlapping MICs. Observe that all six MICs
in Figure 5 are related to gene ume6. Connecting them yields the subgraph of the yeast
regulatory network in Figure 6.

The most obvious problem in Figure 6 is that the observed increase of ume6 is incompatible
with its four targets. This suggests that either the observation on ume6 is incorrect or
that some regulations are missing or wrongly modeled. In the first hypothesis though, one
should note that the current model cannot explain a decrease of ume6: this would imply an
increase of sin3 and in turn an increase of reb1, but then there would be no explanation left
for the variation of hsc82 and rap1. So, in either case, our model should be revised. This is
not a great surprise: our literature-based network, although very reliable, was presumably
far from being complete.

Regarding the biological background, note that ume6 is a known regulator of sporu- 
lation in yeast: in case of nutritional stress, yeast cells stop dividing and produce spores 
by meiosis. These spores are reproductive structures better adapted to extreme condi- 
tions. ume6 is known as a key inhibitor of early meiotic genes: upon entry in meiosis, 
this inhibitory effect is released and the target genes are expressed. Notably, a knock-out
of ume6 causes the expression of meiotic genes during vegetative growth (hence its
name, Unscheduled Meiotic Expression) as well as almost complete failure of sporulation
(Washburn and Esposito 2006). ume6 seems to have activation capabilities as well, though
in that case the effect is believed to be indirect (Chen et al. 2007).

In the current view, ume6 switches from inhibitor to (indirect) activator at the beginning
of meiosis: Ume6p (the protein corresponding to the gene ume6) has a repressive effect
when it forms a complex with Sin3p (note that sin3 is in our network) and Rpd3p, which
is degraded upon entry in meiosis (Mallory et al. 2007). This molecular mechanism can
be interpreted in our model and one possible result is given in Figure 7. At least for
negative targets, we now have a plausible explanation: the real effector of the inhibition on
hsf1, spo12, top1, and ume6 itself is the complex Ume6p-Sin3p, whose variation is
unobserved but depends on the variation of ume6 and sin3. The variation of the targets can
be explained if the protein complex decreases, which is in turn possible if sin3 decreases.
Unfortunately, sin3 is not observed in our data, but we note that a decrease of this gene is fully
compatible with the rest of the network, that is, if we suppose a decrease of reb1. Now
concerning ino2, our network should be updated with more recent evidence: as reviewed
in (Chen et al. 2007), ino2 has several additional regulators, such as opi1 and pah1 (see
Figure 7). The observed variation of pah1 is not useful to explain that of ino2, but opi1 is
definitely a plausible candidate.

Figure 7. Local correction of the network based on our diagnosis method and literature
research.

Here we illustrated one main usage of our diagnosis technique: identifying poorly mod- 
eled regions of a regulatory network that are incompatible with a given data set. This is 
definitely a key asset if one wants to build a large-scale regulatory database and check 
its coherence with newly produced data on a regular basis. Given new data, our diagnosis 
method produces human-understandable representations of possible incompatibilities with 
the current model, which serve as the basis for a targeted literature research. With this data- 
driven approach, a network can then be improved with considerably less effort than with a 
random traversal of publications, for a much more coherent result. 



8 Web Service 

To make our methods easily accessible to a biological audience, we built a web service
(http://data.haiti.cs.uni-potsdam.de/wsgi/app) not requiring any locally installed software
on the user side except for a web browser. It provides the possibility to upload textual
representations of biological networks as well as experimental profiles. Also, a number of
predefined examples allows a user to instantly experience the functionalities of the web
service. These include consistency checking and diagnosis, i.e., finding MICs, whose
implementation has been detailed in Sections 4 and 5.

Figure 8. Representation of identified MICs in textual (left) and graphical (right) mode.

Influence graphs representing biological networks usually contain vertices that are not
subject to any regulation. Such entities are understood as controlled by external factors,
like environmental or particular experimental conditions. To avoid trivial inconsistencies
due to such unregulated and thus unexplainable vertices, the web interface provides an
option "Guess input nodes" for automatically declaring all vertices without any predecessor
as inputs. While consistency checking simply results in a positive or negative answer,
we offer three diagnosis modes: "find one inconsistency", "find all inconsistencies", and 
"approximate all inconsistencies". The first mode aims at finding a single MIC, and the 
second at finding all of them. For the latter, we currently use an encapsulating script that 
repeatedly calls claspD while feeding already identified MICs back as integrity constraints, 
until no further answer set exists. This makes sure that each answer set corresponds to a 
new MIC and thus avoids potential repetitions. The problem of enumerating answer sets 
that differ on a set of "relevant" atoms (in our case, on instances of predicate active) is 
addressed in (Gebser et al. 2009b). The integration of this technique into claspD, in order
to make the wrapper script obsolete, is subject to future work. Once MICs have been
computed, they can be represented either textually or graphically, as shown in Figure |8] 
If the result consists of several MICs, it is possible to view overlapping ones in a com- 
bined way, thus highlighting regions of inconsistency. Finally, the third diagnosis mode, 
"approximate all inconsistencies", works by marking the vertices of a computed MIC as 
inputs before proceeding to look for further MICs. This approach has been used in previous
work (Guziolowski et al. 2009) and has been integrated into our framework for comparison.
However, the results obtained with the third mode depend on the order in which MICs
are found and their vertices declared to be inputs in future computations. Further
functionalities, like prediction under consistency (Guziolowski et al. 2007) and inconsistency
(Gebser et al. 2010), are also featured by the web service but are outside the scope of this
paper.
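The difference between the second and the third diagnosis mode can be sketched in Python. The actual wrapper script is not shown in this paper, so everything below is a hypothetical stand-in: the solver is replaced by a toy function returning MICs from a fixed list, and the function names are assumptions made for illustration.

```python
def enumerate_all(find_one):
    """'Find all inconsistencies': feed found MICs back until no answer remains."""
    found = []
    while (mic := find_one(found)) is not None:
        found.append(mic)
    return found

def approximate_all(find_one_avoiding):
    """'Approximate all inconsistencies': vertices of found MICs become inputs."""
    inputs, found = set(), []
    while (mic := find_one_avoiding(inputs)) is not None:
        found.append(mic)
        inputs |= mic
    return found

def toy_solver(mics):
    """Hypothetical stand-in for solver calls, answering from a fixed list of MICs."""
    def find_one(blocked):
        return next((m for m in mics if m not in blocked), None)
    def find_one_avoiding(inputs):
        # A MIC survives only if none of its vertices has been declared an input.
        return next((m for m in mics if not (m & inputs)), None)
    return find_one, find_one_avoiding

MICS = [frozenset("ab"), frozenset("bc"), frozenset("cd")]
exact, approx = toy_solver(MICS)
print(len(enumerate_all(exact)), len(approximate_all(approx)))

# Reporting the same MICs in a different order changes the approximation.
_, approx_reordered = toy_solver(MICS[1:] + MICS[:1])
print(len(approximate_all(approx_reordered)))
```

On this toy input, exact enumeration returns all three overlapping MICs, whereas the approximation returns two or one of them depending on which MIC is found first.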
9 Discussion

We have provided an approach based on ASP to investigate the consistency between exper- 
imental profiles and influence graphs. In case of inconsistency, the concept of a MIC can be 
exploited for identifying concise explanations, pointing to unreliable data and/or missing
reactions. The problem of finding MICs is closely related to the extraction of MUCs in the 
context of SAT. From a knowledge representation point of view, however, we argue for our 
ASP-based technique, as it provides an easy way to model a problem in terms of a uniform 
encoding and specific instances. 

The BioQuali system (Guziolowski et al. 2009) provides functionalities parallel to our
approach. It also works on influence graphs and applies the same consistency notion. In
preprocessing, BioQuali reduces an influence graph by iteratively marking unobserved
vertices that have no successors as uncritical. This technique is also realized by input
reduction, described in Section 6.1. After that, BioQuali transforms the reduced subgraph into a
Binary Decision Diagram, used for further computations. While consistency checking with 
BioQuali yields the same results as our technique, its diagnosis functionality works like the 
"approximate all inconsistencies" mode, described in the previous section. In contrast to 
our method, this does not, in general, allow all MICs to be found.
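
The preprocessing step described above (iteratively discarding unobserved vertices without successors) can be sketched in Python as follows. This is only an illustration of the idea as stated in the text, not BioQuali's actual code; the function name and the graph representation (a set of vertices and a set of `(source, target)` pairs) are assumptions.

```python
def prune_uncritical(vertices, edges, observed):
    """Iteratively drop unobserved vertices that have no successors."""
    vertices, edges = set(vertices), set(edges)
    while True:
        with_successor = {j for (j, _) in edges}
        removable = {v for v in vertices
                     if v not in observed and v not in with_successor}
        if not removable:
            return vertices, edges
        vertices -= removable
        # Edges pointing to removed vertices disappear, which may expose
        # further vertices without successors in the next round.
        edges = {(j, i) for (j, i) in edges if i not in removable}


kept_vertices, kept_edges = prune_uncritical(
    {"a", "b", "c", "d"},
    {("a", "b"), ("b", "c"), ("a", "d")},
    observed={"a", "d"},
)
```

Here c is removed first, after which b has no successor left and is removed as well, while the observed vertices a and d are kept.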

By now, a variety of efficient ASP tools are available, both for grounding and for solving 
logic programs. Our empirical assessment of them (on random as well as real data) has in 
principle demonstrated the scalability of the approach. The web service implementation of 
finding all MICs, which is genuine to our method and not available in any other existing 
tool, is still based on some workarounds for avoiding redundant answer sets. It is a subject 
of future work to address this with answer set projection (Gebser et al. 2009b).

As elegance and flexibility in modeling are major advantages of ASP, our current appli- 
cation makes it attractive also for related biological questions, beyond the ones addressed 
in this paper. For instance, ongoing work deals with repair and prediction under consis-
tency as well as inconsistency (Gebser et al. 2010). In the future, it will also be interesting to
explore how far the performance of ASP tools can be tuned by varying and optimizing 
encodings for particular tasks. In turn, challenging applications like the one presented here 
might contribute to the further improvement of ASP tools, as they might be geared towards 
efficiency in such domains. 

Appendix A Proofs of Theorems 4.1 and 4.2

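We formalize the representation of instances, as described in Section 4.1, by defining a mapping $\tau$ of an influence graph $(V, E, \sigma)$ and a (partial) vertex labeling $\mu : V \to \{+, -\}$:

\[
\begin{array}{rl}
\tau((V,E,\sigma),\mu) = & \{\mathit{vertex}(i). \mid i \in V\} \\
\cup & \{\mathit{edge}(j,i). \mid (j \to i) \in E\} \\
\cup & \{\mathit{observedE}(j,i,s). \mid (j \to i) \in E,\ \sigma(j,i) = s\} \\
\cup & \{\mathit{observedV}(i,s). \mid i \in V,\ \mu(i) = s\} \\
\cup & \{\mathit{input}(i). \mid i \in V \text{ is an input}\}.
\end{array}
\tag{A1}
\]

By $P_C$, we denote the consistency checking encoding, that is, the schematic rules generating vertex and edge labels, the rules propagating observed labels, the rules defining $\mathit{receive}$, and the integrity constraint on non-input vertices, all of which are repeated in the proofs below.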
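
As a small illustration of the mapping in (A1), the following Python sketch turns an influence graph and a partial vertex labeling into a list of facts. The function name and the constants `plus` and `minus` used for the signs are assumptions made here for illustration; the text itself writes the signs as + and -.

```python
def instance_facts(vertices, edges, sigma, mu, inputs):
    """Build the facts of (A1) as strings.

    sigma maps (some) edges and mu maps (some) vertices to '+' or '-';
    the domain of sigma is assumed to be a subset of edges.
    """
    sign = {"+": "plus", "-": "minus"}  # hypothetical constant names
    facts = [f"vertex({i})." for i in sorted(vertices)]
    facts += [f"edge({j},{i})." for (j, i) in sorted(edges)]
    facts += [f"observedE({j},{i},{sign[s]})."
              for (j, i), s in sorted(sigma.items())]
    facts += [f"observedV({i},{sign[s]})." for i, s in sorted(mu.items())]
    facts += [f"input({i})." for i in sorted(inputs)]
    return facts


facts = instance_facts(
    vertices={"a", "b"},
    edges={("a", "b")},
    sigma={("a", "b"): "+"},
    mu={"b": "-"},
    inputs={"a"},
)
```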
Proof of Theorem 4.1

Assume that $X$ is an answer set of $P_C \cup \tau((V,E,\sigma),\mu)$. Furthermore, let

\[
P^X = \{(\mathit{head}(r) \leftarrow \mathit{body}(r)^+)\theta \mid r \in P_C \cup \tau((V,E,\sigma),\mu),\ (\mathit{body}(r)^-\theta) \cap X = \emptyset,\ \theta : \mathit{var}(r) \to \mathcal{U}\}
\]

where $\mathit{var}(r)$ is the set of all variables that occur in a rule $r$, $\mathcal{U}$ is the set of all constants appearing in $P_C \cup \tau((V,E,\sigma),\mu)$, and $\theta$ is a ground substitution for the variables in $r$. Then, by the definition of an answer set, we know that $X$ is a $\subseteq$-minimal model of $P^X$. Given $X$, we define $\sigma'$ and $\mu'$ as follows:

\[
\begin{array}{l}
\sigma' = \{(j \to i) \mapsto s \mid (j \to i) \in E,\ \mathit{labelE}(j,i,s) \in X\} \\
\mu' = \{i \mapsto s \mid i \in V,\ \mathit{labelV}(i,s) \in X\}.
\end{array}
\]

We show that $\sigma'$ and $\mu'$ are total labelings of edges and vertices, respectively, such that $\mu'(i) = \mu'(j)\sigma'(j,i)$ holds for every non-input vertex $i \in V$ and some edge $j \to i$ in $E$.

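Regarding the uniqueness of labels assigned by $\sigma'$ and $\mu'$, consider the following rules from $P_C$ that include predicates $\mathit{labelE}$ and $\mathit{labelV}$ in their heads:

\[
\begin{array}{l}
\mathit{labelV}(V,+);\ \mathit{labelV}(V,-) \leftarrow \mathit{vertex}(V). \\
\mathit{labelE}(U,V,+);\ \mathit{labelE}(U,V,-) \leftarrow \mathit{edge}(U,V). \\
\mathit{labelV}(V,S) \leftarrow \mathit{observedV}(V,S). \\
\mathit{labelE}(U,V,S) \leftarrow \mathit{observedE}(U,V,S).
\end{array}
\tag{A2}
\]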

Since the given (partial) labelings $\sigma$ and $\mu$ assign unique labels to the elements of their domains, facts defining $\mathit{observedE}$ and $\mathit{observedV}$ are of the form $\mathit{observedE}(j,i,+).$ or $\mathit{observedE}(j,i,-).$ and $\mathit{observedV}(i,+).$ or $\mathit{observedV}(i,-).$, respectively, and at most one of these facts is contained in $\tau((V,E,\sigma),\mu)$ for an edge $(j \to i) \in E$ or a vertex $i \in V$. Because $X$ is a $\subseteq$-minimal model of $P^X$, the atoms in the heads of facts are in $X$, and all atoms in $X$ over predicates $\mathit{observedE}$ and $\mathit{observedV}$ are derived from facts in $\tau((V,E,\sigma),\mu)$, in view that these predicates do not occur in the head of any rule in $P_C$. Hence, at most one of the atoms $\mathit{labelE}(j,i,+)$ and $\mathit{labelE}(j,i,-)$ or $\mathit{labelV}(i,+)$ and $\mathit{labelV}(i,-)$, respectively, is derivable for an edge $(j \to i) \in E$ or vertex $i \in V$ from a ground instance of the fourth or third rule in (A2) and then included in $X$. Furthermore, the second and first rule in (A2) impose that at least one of $\mathit{labelE}(j,i,+)$ or $\mathit{labelE}(j,i,-)$ and $\mathit{labelV}(i,+)$ or $\mathit{labelV}(i,-)$ belongs to $X$ for every edge $(j \to i) \in E$ and vertex $i \in V$, respectively, while the atom containing the opposite label cannot belong to a $\subseteq$-minimal model of $P^X$. Hence, there is exactly one term $s$ such that $\mathit{labelE}(j,i,s) \in X$ or $\mathit{labelV}(i,s) \in X$ for an edge $(j \to i) \in E$ or vertex $i \in V$, respectively, and it holds that $s \in \{+,-\}$, which allows us to conclude that $\sigma'$ and $\mu'$ are total labelings.

As regards extending $\sigma$ and $\mu$, we have that fact $\mathit{observedE}(j,i,s).$ or $\mathit{observedV}(i,s).$ belongs to $\tau((V,E,\sigma),\mu)$ if $\sigma(j,i) = s$ or $\mu(i) = s$, respectively, is given. This implies that $\mathit{labelE}(j,i,s) \in X$ or $\mathit{labelV}(i,s) \in X$, respectively, as the fourth or third rule in (A2) would be unsatisfied otherwise. Thus, $\sigma'(j,i) = s$ if $\sigma(j,i) = s$, and $\mu'(i) = s$ if $\mu(i) = s$.

It remains to be shown that $\mu'(i)$ is consistent for each non-input vertex $i \in V$. To this end, we note that the integrity constraint

\[
\leftarrow \mathit{labelV}(V,S),\ \mathit{not}\ \mathit{receive}(V,S),\ \mathit{not}\ \mathit{input}(V).
\]

from (7) necessitates $\mathit{receive}(i,r) \in X$ if $\mu'(i) = r$ (that is, if $\mathit{labelV}(i,r) \in X$) for a non-input vertex $i \in V$. Otherwise, $P^X$ would contain an unsatisfied ground instance in view that $\mathit{input}(i) \in X$ exactly if fact $\mathit{input}(i).$ is included in $\tau((V,E,\sigma),\mu)$. However, any ground instances of the integrity constraint contributing to $P^X$ do not contain atoms over predicate $\mathit{receive}$. Such atoms can only be derived using the following rules from (6):

\[
\begin{array}{l}
\mathit{receive}(V,+) \leftarrow \mathit{labelE}(U,V,S),\ \mathit{labelV}(U,S). \\
\mathit{receive}(V,-) \leftarrow \mathit{labelE}(U,V,S),\ \mathit{labelV}(U,T),\ S \neq T.
\end{array}
\]

Since $X$ is a $\subseteq$-minimal model of $P^X$, $\mathit{receive}(i,+) \in X$ or $\mathit{receive}(i,-) \in X$ is possible only if $\mathit{labelE}(j,i,s) \in X$ and $\mathit{labelV}(j,t) \in X$ such that $s = t$ or $s \neq t$, that is, if $\sigma'(j,i) = s$ and $\mu'(j) = t$ such that $\mu'(j)\sigma'(j,i) = +$ or $\mu'(j)\sigma'(j,i) = -$, respectively. As $\mathit{labelV}(i,r)$ is accompanied by $\mathit{receive}(i,r)$ in $X$ for each non-input vertex $i \in V$, this allows us to conclude that $\mu'(i) = r$ implies $\mu'(j)\sigma'(j,i) = r$ for some regulator $j$ of $i$. Hence, we have that $\mu'(i)$ is consistent for each non-input vertex $i \in V$. $\Box$
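
The proof relies on the characterization of an answer set $X$ as a $\subseteq$-minimal model of the reduct $P^X$. For ground programs, this check can be sketched in a few lines of Python. The representation of a rule as a triple `(head, positive_body, negative_body)` of frozensets is an assumption made for this illustration, and the minimality test simply enumerates all proper subsets, so it is only meant for toy programs.

```python
from itertools import combinations


def reduct(rules, X):
    """Drop rules whose negative body intersects X; drop negative bodies."""
    return [(h, p) for (h, p, n) in rules if not (n & X)]


def is_model(Y, positive_rules):
    """Y satisfies a rule if the body is not contained in Y or a head atom is."""
    return all(bool(h & Y) or not (p <= Y) for (h, p) in positive_rules)


def is_answer_set(rules, X):
    """X is an answer set iff it is a subset-minimal model of the reduct."""
    PX = reduct(rules, X)
    if not is_model(X, PX):
        return False
    atoms = sorted(X)
    return not any(is_model(frozenset(c), PX)
                   for size in range(len(atoms))
                   for c in combinations(atoms, size))


fs = frozenset
# a ; b.     c :- a.     :- not c.
program = [
    (fs({"a", "b"}), fs(), fs()),
    (fs({"c"}), fs({"a"}), fs()),
    (fs(), fs(), fs({"c"})),
]
check_ac = is_answer_set(program, fs({"a", "c"}))
check_b = is_answer_set(program, fs({"b"}))
check_abc = is_answer_set(program, fs({"a", "b", "c"}))
```

An integrity constraint is represented by an empty head, so it is violated as soon as its body holds in the reduct.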



Proof of Theorem 4.2

Assume that $(V,E,\sigma)$ and $\mu$ are consistent. Then, there are total extensions $\sigma' : E \to \{+,-\}$ of $\sigma$ and $\mu' : V \to \{+,-\}$ of $\mu$ such that, for each non-input vertex $i \in V$, we have $\mu'(i) = \mu'(j)\sigma'(j,i)$ for some edge $j \to i$ in $E$. We consider the following set $X$ of atoms:

\[
\begin{array}{rl}
X = & \{\mathit{vertex}(i),\ \mathit{labelV}(i,s) \mid i \in V,\ \mu'(i) = s\} \\
\cup & \{\mathit{edge}(j,i),\ \mathit{labelE}(j,i,s) \mid (j \to i) \in E,\ \sigma'(j,i) = s\} \\
\cup & \{\mathit{receive}(i,ts) \mid (j \to i) \in E,\ \sigma'(j,i) = s,\ \mu'(j) = t\} \\
\cup & \{\mathit{observedE}(j,i,s) \mid (j \to i) \in E,\ \sigma(j,i) = s\} \\
\cup & \{\mathit{observedV}(i,s) \mid i \in V,\ \mu(i) = s\} \\
\cup & \{\mathit{input}(i) \mid i \in V \text{ is an input}\}.
\end{array}
\]

For showing that $X$ is an answer set of $P_C \cup \tau((V,E,\sigma),\mu)$, we need to verify that $X$ is a $\subseteq$-minimal model of

\[
P^X = \{(\mathit{head}(r) \leftarrow \mathit{body}(r)^+)\theta \mid r \in P_C \cup \tau((V,E,\sigma),\mu),\ (\mathit{body}(r)^-\theta) \cap X = \emptyset,\ \theta : \mathit{var}(r) \to \mathcal{U}\}
\]

where $\mathit{var}(r)$ is the set of all variables that occur in a rule $r$, $\mathcal{U}$ is the set of all constants appearing in $P_C \cup \tau((V,E,\sigma),\mu)$, and $\theta$ is a ground substitution for the variables in $r$.

To start with, we note that $X$ includes an atom $\mathit{vertex}(i)$, $\mathit{edge}(j,i)$, $\mathit{observedE}(j,i,s)$, $\mathit{observedV}(i,s)$, and $\mathit{input}(i)$, respectively, exactly if there is a fact with the atom in the head in $\tau((V,E,\sigma),\mu)$. Each of these facts belongs also to $P^X$, is satisfied by $X$, but not by any set $Y$ of atoms excluding at least one of the head atoms. Furthermore, since $\sigma'$ and $\mu'$ are total mappings, we have that $|\{\mathit{labelE}(j,i,+), \mathit{labelE}(j,i,-)\} \cap X| = 1$ and $|\{\mathit{labelV}(i,+), \mathit{labelV}(i,-)\} \cap X| = 1$ for every $(j \to i) \in E$ and $i \in V$, respectively. Hence, $X$, but no subset $Y$ of $X$ excluding at least one atom over predicates $\mathit{labelE}$ and $\mathit{labelV}$, satisfies all ground instances of the following rules from $P_C$ in $P^X$:

\[
\begin{array}{l}
\mathit{labelV}(V,+);\ \mathit{labelV}(V,-) \leftarrow \mathit{vertex}(V). \\
\mathit{labelE}(U,V,+);\ \mathit{labelE}(U,V,-) \leftarrow \mathit{edge}(U,V).
\end{array}
\]

In addition, since $\sigma'$ and $\mu'$ extend $\sigma$ and $\mu$, respectively, all ground instances of the following rules from (4) in $P^X$ are satisfied by $X$:

\[
\begin{array}{l}
\mathit{labelV}(V,S) \leftarrow \mathit{observedV}(V,S). \\
\mathit{labelE}(U,V,S) \leftarrow \mathit{observedE}(U,V,S).
\end{array}
\]

Since $\mathit{labelE}(j,i,s) \in X$ and $\mathit{labelV}(j,t) \in X$ if $\sigma'(j,i) = s$ and $\mu'(j) = t$, respectively, we have that $\mathit{receive}(i,ts) \in X$ exactly if there is a ground instance of the rules

\[
\begin{array}{l}
\mathit{receive}(V,+) \leftarrow \mathit{labelE}(U,V,S),\ \mathit{labelV}(U,S). \\
\mathit{receive}(V,-) \leftarrow \mathit{labelE}(U,V,S),\ \mathit{labelV}(U,T),\ S \neq T.
\end{array}
\]

from (6) in $P^X$ such that $\mathit{labelE}(j,i,s), \mathit{labelV}(j,t) \in X$ occur in the body and $\mathit{receive}(i,ts)$ in the head. Hence, no subset $Y$ of $X$ excluding any atom over predicate $\mathit{receive}$ is a model of $P^X$. Finally, since $\mu'(i) = \mu'(j)\sigma'(j,i)$ for each non-input vertex $i \in V$ and some $j \to i$ in $E$, $\mathit{labelV}(i,r) \in X$ implies that $\mathit{receive}(i,r) \in X$. That is, the ground instances of the integrity constraint

\[
\leftarrow \mathit{labelV}(V,S),\ \mathit{not}\ \mathit{receive}(V,S),\ \mathit{not}\ \mathit{input}(V).
\]

from (7) that contribute to $P^X$ are satisfied by $X$.

We have now investigated all rules in $P_C \cup \tau((V,E,\sigma),\mu)$ and shown that their ground instances in $P^X$ are satisfied by $X$. Furthermore, we have checked for all atoms in $X$ that they cannot be excluded in any model $Y \subset X$ of $P^X$. That is, $X$ is indeed a $\subseteq$-minimal model of $P^X$ and thus an answer set of $P_C \cup \tau((V,E,\sigma),\mu)$. $\Box$
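
The consistency notion that the two theorems relate to answer sets can also be checked directly on small instances. The following Python sketch enumerates all total extensions of the given labelings by brute force; it is an illustration of the definition only, not the ASP encoding, and encoding the signs + and - as the integers 1 and -1 is an assumption made for convenience.

```python
from itertools import product


def is_consistent(vertices, edges, sigma, mu, inputs):
    """Search for total extensions explaining every non-input vertex."""
    vs, es = sorted(vertices), sorted(edges)
    free_v = [v for v in vs if v not in mu]
    free_e = [e for e in es if e not in sigma]
    for v_signs in product((1, -1), repeat=len(free_v)):
        mu2 = {**mu, **dict(zip(free_v, v_signs))}
        for e_signs in product((1, -1), repeat=len(free_e)):
            sigma2 = {**sigma, **dict(zip(free_e, e_signs))}
            # A non-input vertex needs at least one regulator j whose
            # influence mu'(j) * sigma'(j, i) matches its own label.
            if all(v in inputs
                   or any(mu2[j] * sigma2[(j, i)] == mu2[i]
                          for (j, i) in es if i == v)
                   for v in vs):
                return True
    return False


V, E = {"a", "b"}, {("a", "b")}
conflict = is_consistent(V, E, {("a", "b"): 1}, {"a": 1, "b": -1}, {"a"})
agree = is_consistent(V, E, {("a", "b"): 1}, {"a": 1, "b": 1}, {"a"})
unknown_edge = is_consistent(V, E, {}, {"a": 1, "b": -1}, {"a"})
```

In the third call the edge label is unobserved, so labeling the edge with - explains the decrease of b.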

Appendix B Proofs of Theorems 5.1 and 5.2

This appendix provides proofs for soundness and completeness of the MIC extraction encoding in Section 5. We use $\tau((V,E,\sigma),\mu)$ as defined in (A1) to refer to the facts representing an influence graph $(V,E,\sigma)$ and a (partial) vertex labeling $\mu : V \to \{+,-\}$. By $P_D$, we denote the encoding consisting of the schematic rules in (4), (8), (9), and (10). As an auxiliary concept, for any subset $W \subseteq V$, we say that $\sigma' : E \to \{+,-\}$ and $\mu' : V \to \{+,-\}$ are witnessing labelings for $W$ if the following conditions hold:

1. $\sigma'$ and $\mu'$ are total,

2. if $\sigma(j,i)$ is defined, then $\sigma'(j,i) = \sigma(j,i)$,

3. if $\mu(i)$ is defined, then $\mu'(i) = \mu(i)$, and

4. $\mu'(i)$ is consistent (relative to $\sigma'$) for each non-input vertex $i \in W$.

The above conditions make sure that $\sigma'$ and $\mu'$ are total extensions of $\sigma$ and $\mu$, respectively, such that the variations of vertices in $W$ are explained. In terms of Definition 5.1, the first condition requires the absence of witnessing labelings for a MIC $W$, while the second condition stipulates the existence of witnessing labelings for each $W' \subset W$.
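
The four conditions for witnessing labelings translate almost literally into a check. The Python sketch below is only an illustration of the definition; the function name, the dictionary-based representation of labelings, and the encoding of + and - as 1 and -1 are assumptions.

```python
def is_witnessing(W, sigma2, mu2, vertices, edges, sigma, mu, inputs):
    """Check whether sigma2 and mu2 are witnessing labelings for W."""
    # Condition 1: both labelings are total.
    if set(sigma2) != set(edges) or set(mu2) != set(vertices):
        return False
    # Conditions 2 and 3: they extend the given partial labelings.
    if any(sigma2[e] != s for e, s in sigma.items()):
        return False
    if any(mu2[v] != s for v, s in mu.items()):
        return False
    # Condition 4: every non-input vertex in W receives a matching influence.
    return all(any(mu2[j] * sigma2[(j, i)] == mu2[i]
                   for (j, i) in edges if i == v)
               for v in W if v not in inputs)


V = {"a", "b", "c"}
E = {("a", "b"), ("b", "c")}
sigma = {("a", "b"): 1, ("b", "c"): 1}
mu = {"a": 1, "c": -1}
b_decreases = {"a": 1, "b": -1, "c": -1}

explains_c = is_witnessing({"c"}, sigma, b_decreases, V, E, sigma, mu, {"a"})
explains_bc = is_witnessing({"b", "c"}, sigma, b_decreases, V, E, sigma, mu, {"a"})
```

Labeling b with - explains the observed decrease of c, but it leaves b itself unexplained, so the same labelings are not witnessing for {b, c}.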



Proof of Theorem 5.1

Assume that $X$ is an answer set of $P_D \cup \tau((V,E,\sigma),\mu)$. Furthermore, let

\[
P^X = \{(\mathit{head}(r) \leftarrow \mathit{body}(r)^+)\theta \mid r \in P_D \cup \tau((V,E,\sigma),\mu),\ (\mathit{body}(r)^-\theta) \cap X = \emptyset,\ \theta : \mathit{var}(r) \to \mathcal{U}\}
\]

where $\mathit{var}(r)$ is the set of all variables that occur in a rule $r$, $\mathcal{U}$ is the set of all constants appearing in $P_D \cup \tau((V,E,\sigma),\mu)$, and $\theta$ is a ground substitution for the variables in $r$. Then, by the definition of an answer set, we know that $X$ is a $\subseteq$-minimal model of $P^X$. Let $W = \{i \mid \mathit{active}(i) \in X\}$. We have to show that the following conditions hold:

1. There are witnessing labelings for each $W' \subset W$.

2. There are no witnessing labelings for $W$.

Below, we consider these conditions one after the other.

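Condition 1. Let $W' = W \setminus \{k\}$ for any $k \in W$. Furthermore, define $\sigma'$ and $\mu'$ as follows:

\[
\begin{array}{rl}
\sigma' = & \{(j \to i) \mapsto s \mid (j \to i) \in E,\ \mathit{labelE}'(k,j,i,s) \in X\} \\
\cup & \{(j \to i) \mapsto + \mid (j \to i) \in E,\ \mathit{labelE}'(k,j,i,+) \notin X,\ \mathit{labelE}'(k,j,i,-) \notin X\} \\
\mu' = & \{i \mapsto s \mid i \in V,\ \mathit{labelV}'(k,i,s) \in X\} \\
\cup & \{i \mapsto + \mid i \in V,\ \mathit{labelV}'(k,i,+) \notin X,\ \mathit{labelV}'(k,i,-) \notin X\}.
\end{array}
\]

We show that $\sigma'$ and $\mu'$ are witnessing labelings for $W'$.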

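Regarding the uniqueness of labels assigned by $\sigma'$ and $\mu'$, consider the following rules from (10) including predicates $\mathit{labelE}'$ and $\mathit{labelV}'$ in their heads:

\[
\begin{array}{l}
\mathit{labelV}'(W,V,+);\ \mathit{labelV}'(W,V,-) \leftarrow \mathit{active}(W),\ \mathit{vertexMIC}(V). \\
\mathit{labelE}'(W,U,V,+);\ \mathit{labelE}'(W,U,V,-) \leftarrow \mathit{active}(W),\ \mathit{edgeMIC}(U,V). \\
\mathit{labelV}'(W,V,S) \leftarrow \mathit{active}(W),\ \mathit{observedV}(V,S). \\
\mathit{labelE}'(W,U,V,S) \leftarrow \mathit{active}(W),\ \mathit{observedE}(U,V,S).
\end{array}
\tag{B1}
\]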



Since the given (partial) labelings $\sigma$ and $\mu$ assign unique labels to the elements of their domains, facts defining $\mathit{observedE}$ and $\mathit{observedV}$ are of the form $\mathit{observedE}(j,i,+).$ or $\mathit{observedE}(j,i,-).$ and $\mathit{observedV}(i,+).$ or $\mathit{observedV}(i,-).$, respectively, and at most one of these facts is contained in $\tau((V,E,\sigma),\mu)$ for an edge $(j \to i) \in E$ or vertex $i \in V$. Because $X$ is a $\subseteq$-minimal model of $P^X$, the atoms in the heads of facts are in $X$, and all atoms in $X$ over predicates $\mathit{observedE}$ and $\mathit{observedV}$ are derived from facts in $\tau((V,E,\sigma),\mu)$, in view that these predicates do not occur in the head of any rule in $P_D$. Hence, at most one of the atoms $\mathit{labelE}'(k,j,i,+)$ and $\mathit{labelE}'(k,j,i,-)$ or $\mathit{labelV}'(k,i,+)$ and $\mathit{labelV}'(k,i,-)$, respectively, is derivable for an edge $(j \to i) \in E$ or vertex $i \in V$ from a ground instance of the fourth or third rule in (B1) and then included in $X$. If either of $\mathit{labelE}'(k,j,i,+)$ and $\mathit{labelE}'(k,j,i,-)$ or $\mathit{labelV}'(k,i,+)$ and $\mathit{labelV}'(k,i,-)$, respectively, is included in $X$, then the ground instance of the second or first rule in (B1) for $k$ and an edge $(j \to i) \in E$ or vertex $i \in V$ is satisfied, so that the atom containing the opposite label cannot belong to a $\subseteq$-minimal model of $P^X$. Hence, there is at most one term $s$ such that $\sigma'(j,i) = s$ or $\mu'(i) = s$ for an edge $(j \to i) \in E$ or vertex $i \in V$, respectively, and it holds that $s \in \{+,-\}$. Furthermore, looking at the definitions of $\sigma'$ and $\mu'$, it is obvious that both are total, which allows us to conclude that $\sigma'$ and $\mu'$ are total labelings.

As regards extending $\sigma$ and $\mu$, we have that fact $\mathit{observedE}(j,i,s).$ or $\mathit{observedV}(i,s).$ belongs to $\tau((V,E,\sigma),\mu)$ if $\sigma(j,i) = s$ or $\mu(i) = s$, respectively, is given. Along with the premise that $\mathit{active}(k) \in X$, this implies that $\mathit{labelE}'(k,j,i,s) \in X$ or $\mathit{labelV}'(k,i,s) \in X$, respectively, as the fourth or third rule in (B1) would be unsatisfied otherwise. Hence, we have $\sigma'(j,i) = s$ if $\sigma(j,i) = s$, and $\mu'(i) = s$ if $\mu(i) = s$.
It remains to be shown that $\mu'(i)$ is consistent for each non-input vertex $i \in W'$. To establish this, we first consider the following rules from (8):

\[
\begin{array}{l}
\mathit{edgeMIC}(U,V) \leftarrow \mathit{edge}(U,V),\ \mathit{active}(V). \\
\mathit{vertexMIC}(U) \leftarrow \mathit{edgeMIC}(U,V). \\
\mathit{vertexMIC}(V) \leftarrow \mathit{active}(V).
\end{array}
\tag{B2}
\]

In view that fact $\mathit{edge}(j,i).$ belongs to $\tau((V,E,\sigma),\mu)$ for every $(j \to i) \in E$, we conclude that $\mathit{edge}(j,i) \in X$. Along with $\mathit{active}(i) \in X$ for every $i \in W$, it follows that $\mathit{edgeMIC}(j,i) \in X$ for every $(j \to i) \in E$ such that $i \in W$, and $\mathit{vertexMIC}(i) \in X$ for every $i \in W$. The last observation and the first rule in (B1) imply that $\mathit{labelV}'(k,i,+) \in X$ or $\mathit{labelV}'(k,i,-) \in X$ for every $i \in W$. For $i \in W'$, i.e., $i \neq k$, the integrity constraint

\[
\leftarrow \mathit{labelV}'(W,V,S),\ \mathit{active}(V),\ V \neq W,\ \mathit{not}\ \mathit{receive}'(W,V,S).
\]

from (10) imposes $\mathit{receive}'(k,i,+) \in X$ if $\mathit{labelV}'(k,i,+) \in X$, and $\mathit{receive}'(k,i,-) \in X$ if $\mathit{labelV}'(k,i,-) \in X$, while any ground instances of the integrity constraint contributing to $P^X$ do not contain atoms over predicate $\mathit{receive}'$. Such atoms can only be derived using the following rules from (10):

\[
\begin{array}{l}
\mathit{receive}'(W,V,+) \leftarrow \mathit{labelE}'(W,U,V,S),\ \mathit{labelV}'(W,U,S),\ V \neq W. \\
\mathit{receive}'(W,V,-) \leftarrow \mathit{labelE}'(W,U,V,S),\ \mathit{labelV}'(W,U,T),\ V \neq W,\ S \neq T.
\end{array}
\]

Since $X$ is a $\subseteq$-minimal model of $P^X$, $\mathit{receive}'(k,i,+) \in X$ or $\mathit{receive}'(k,i,-) \in X$ is possible only if $\mathit{labelE}'(k,j,i,s) \in X$ and $\mathit{labelV}'(k,j,t) \in X$ such that $s = t$ or $s \neq t$, respectively. Comparing $\tau((V,E,\sigma),\mu)$ and the rules in (B1), (B2), as well as (B3) reveals that $(j \to i) \in E$ is a necessary condition for $\mathit{labelE}'(k,j,i,s) \in X$, and the same applies to $j \in V$ and $\mathit{labelV}'(k,j,t) \in X$. By the construction of $\sigma'$ and $\mu'$, $\mathit{labelE}'(k,j,i,s) \in X$ implies that $\sigma'(j,i) = s$, and $\mathit{labelV}'(k,j,t) \in X$ that $\mu'(j) = t$. We conclude that $\mathit{receive}'(k,i,+) \in X$ or $\mathit{receive}'(k,i,-) \in X$ necessitates $\mu'(j)\sigma'(j,i) = +$ or $\mu'(j)\sigma'(j,i) = -$, respectively, for some regulator $j$ of $i$. Finally, we have $\mu'(i) = +$ if $\mathit{labelV}'(k,i,+) \in X$ (and $\mathit{receive}'(k,i,+) \in X$), and $\mu'(i) = -$ if $\mathit{labelV}'(k,i,-) \in X$ (and $\mathit{receive}'(k,i,-) \in X$). This shows that $i$ receives some influence matching $\mu'(i)$, so that $\mu'(i)$ is consistent. Since $i \in W'$ is arbitrary, $\sigma'$ and $\mu'$ are witnessing labelings for $W'$. To conclude the proof of the first condition to verify, we note that witnessing labelings for $W'$ are also witnessing labelings for all subsets of $W'$. Hence, it is sufficient to check the existence of witnessing labelings for sets $W' = W \setminus \{k\}$ for any $k \in W$. As shown above, an answer set $X$ of $P_D \cup \tau((V,E,\sigma),\mu)$ yields witnessing labelings for them. Hence, the second condition in Definition 5.1 holds for $W = \{i \mid \mathit{active}(i) \in X\}$.

Condition 2. We now show by contradiction that there cannot be witnessing labelings for $W$. To establish this, we first note that vertices in $W$ cannot be input because, if fact $\mathit{input}(i).$ belongs to $\tau((V,E,\sigma),\mu)$, then $\mathit{input}(i)$ must be included in $X$, so that the rule

\[
\mathit{active}(V);\ \mathit{inactive}(V) \leftarrow \mathit{vertex}(V),\ \mathit{not}\ \mathit{input}(V).
\tag{B3}
\]

from (8) does not contribute a ground instance for $i$ to $P^X$. Since $\mathit{active}(i)$ cannot be derived from any other ground rule in $P^X$, the fact that $X$ is a $\subseteq$-minimal model of $P^X$ implies that $\mathit{active}(i) \notin X$ for any input vertex $i$. Furthermore, the integrity constraint

\[
\leftarrow \mathit{not}\ \mathit{bottom}.
\tag{B4}
\]

from (9) necessitates $\mathit{bottom} \in X$ because $X$ cannot be a model of $P^X$ otherwise. Then, we get $\mathit{labelV}(i,+), \mathit{labelV}(i,-) \in X$ and $\mathit{labelE}(j,i,+), \mathit{labelE}(j,i,-) \in X$ for all vertices $i \in V$ and edges $(j \to i) \in E$, respectively, due to the following rules from (9):

\[
\begin{array}{l}
\mathit{labelV}(V,+) \leftarrow \mathit{bottom},\ \mathit{vertex}(V). \\
\mathit{labelV}(V,-) \leftarrow \mathit{bottom},\ \mathit{vertex}(V). \\
\mathit{labelE}(U,V,+) \leftarrow \mathit{bottom},\ \mathit{edge}(U,V). \\
\mathit{labelE}(U,V,-) \leftarrow \mathit{bottom},\ \mathit{edge}(U,V).
\end{array}
\tag{B5}
\]
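We now show that the existence of witnessing labelings for $W$ yields a contradiction to the fact that $X$ is a $\subseteq$-minimal model of $P^X$. To this end, assume that $\sigma'$ and $\mu'$ are witnessing labelings for $W$. Then, let

\[
\begin{array}{rl}
Y = & (X \setminus (\{\mathit{bottom}\} \\
 & \quad \cup\ \{\mathit{labelV}(i,s) \mid \mathit{labelV}(i,s) \in X\} \\
 & \quad \cup\ \{\mathit{labelE}(j,i,s) \mid \mathit{labelE}(j,i,s) \in X\} \\
 & \quad \cup\ \{\mathit{opposite}(j,i) \mid \mathit{opposite}(j,i) \in X\})) \\
\cup & \{\mathit{labelV}(i,s) \mid i \in V,\ \mu'(i) = s\} \\
\cup & \{\mathit{labelE}(j,i,s) \mid (j \to i) \in E,\ \sigma'(j,i) = s\} \\
\cup & \{\mathit{opposite}(j,i) \mid (j \to i) \in E,\ \mu'(i) \neq \mu'(j)\sigma'(j,i)\}.
\end{array}
\]

Since $\mathit{bottom} \in X \setminus Y$ and $X$ contains a maximum amount of atoms over predicates $\mathit{labelV}$, $\mathit{labelE}$, and $\mathit{opposite}$ (the atoms over $\mathit{opposite}$ are consequences of the inclusion of atoms over $\mathit{labelV}$ and $\mathit{labelE}$), we have that $Y \subset X$, and we show that $Y$ is a model of $P^X$.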

Considering the contributions of the facts in $\tau((V,E,\sigma),\mu)$ and the rules in (10) to $P^X$, we observe that the atoms over predicates occurring in them are interpreted the same in $X$ and $Y$. Hence, such facts and rules stay satisfied by $Y$ because they were already satisfied by $X$. The same applies to the rules from (8) repeated in (B2) and (B3). Furthermore, since $\sigma'$ and $\mu'$ are total and extend $\sigma$ and $\mu$, respectively, the contributions of the following rules of $P_D$ (the first two of which stem from (4)) to $P^X$ are satisfied by $Y$:

\[
\begin{array}{l}
\mathit{labelV}(V,S) \leftarrow \mathit{observedV}(V,S). \\
\mathit{labelE}(U,V,S) \leftarrow \mathit{observedE}(U,V,S). \\
\mathit{labelV}(V,+);\ \mathit{labelV}(V,-) \leftarrow \mathit{vertexMIC}(V). \\
\mathit{labelE}(U,V,+);\ \mathit{labelE}(U,V,-) \leftarrow \mathit{edgeMIC}(U,V).
\end{array}
\]

Since the integrity constraint in (B4) does not belong to $P^X$ and the rules in (B5) are satisfied by $Y$ in view of $\mathit{bottom} \notin Y$, it remains to consider the following rules from $P_D$:

\[
\begin{array}{l}
\mathit{opposite}(U,V) \leftarrow \mathit{labelE}(U,V,-),\ \mathit{labelV}(U,S),\ \mathit{labelV}(V,S). \\
\mathit{opposite}(U,V) \leftarrow \mathit{labelE}(U,V,+),\ \mathit{labelV}(U,S),\ \mathit{labelV}(V,T),\ S \neq T. \\
\mathit{bottom} \leftarrow \mathit{active}(V),\ \mathit{opposite}(U,V) : \mathit{edge}(U,V).
\end{array}
\]

The rules defining predicate $\mathit{opposite}$ are such that, in order to satisfy their ground instances in $P^X$, $Y$ must contain $\mathit{opposite}(j,i)$ if $\mathit{labelE}(j,i,r)$, $\mathit{labelV}(j,s)$, and $\mathit{labelV}(i,t)$ belong to $Y$ such that $t \neq sr$. This matches the definition of $Y$, including $\mathit{labelE}(j,i,r)$ if $\sigma'(j,i) = r$, $\mathit{labelV}(j,s)$ if $\mu'(j) = s$, $\mathit{labelV}(i,t)$ if $\mu'(i) = t$, and $\mathit{opposite}(j,i)$ if $\mu'(i) \neq \mu'(j)\sigma'(j,i)$. Hence, rules defining $\mathit{opposite}$ in $P^X$ are satisfied by $Y$. It remains to be shown that $\mathit{bottom}$ is not derivable from any ground instance of the last rule. In this regard, recall that $W = \{i \mid \mathit{active}(i) \in X\} = \{i \mid \mathit{active}(i) \in Y\}$, and we have seen above that $\mathit{active}(i)$ can only belong to $X$ if $i$ is not an input. As $\sigma'$ and $\mu'$ are witnessing labelings for $W$, for every $i \in W$, there is an edge $(j \to i) \in E$ such that $\mu'(i) = \mu'(j)\sigma'(j,i)$. By the definition of $Y$, this implies $\mathit{opposite}(j,i) \notin Y$, while $\mathit{edge}(j,i)$ belongs to $X$ and $Y$ because $X$ and $Y$ are models of $\tau((V,E,\sigma),\mu)$. As a consequence, for every $i \in W$, we have $\{\mathit{opposite}(j,i) \mid \mathit{edge}(j,i) \in Y\} \not\subseteq Y$, so that the ground instance for $i$ in $P^X$ of the rule with $\mathit{bottom}$ in the head is satisfied by $Y$. We have thus established that $Y \subset X$ is indeed a model of $P^X$, a contradiction to the assumption that $X$ is a $\subseteq$-minimal model of $P^X$ and an answer set of $P_D \cup \tau((V,E,\sigma),\mu)$.

The above contradiction shows that the second condition to verify, which is the first condition in Definition 5.1, holds for $W = \{i \mid \mathit{active}(i) \in X\}$. The fact that the second condition in Definition 5.1 holds for $W$ has been shown before. Hence, $W$ is a MIC. $\Box$
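
On toy instances, the two conditions established in this proof can be tested by exhaustive search. The Python sketch below enumerates candidate sets $W$ of non-input vertices and keeps those without witnessing labelings whose subsets $W \setminus \{k\}$ all have some. It is a brute-force illustration of the MIC definition, not the ASP encoding $P_D$; the function names and the encoding of + and - as 1 and -1 are assumptions.

```python
from itertools import combinations, product


def has_witness(W, vertices, edges, sigma, mu, inputs):
    """True if some total extension explains every non-input vertex in W."""
    free_v = sorted(v for v in vertices if v not in mu)
    free_e = sorted(e for e in edges if e not in sigma)
    for v_signs in product((1, -1), repeat=len(free_v)):
        mu2 = {**mu, **dict(zip(free_v, v_signs))}
        for e_signs in product((1, -1), repeat=len(free_e)):
            sigma2 = {**sigma, **dict(zip(free_e, e_signs))}
            if all(any(mu2[j] * sigma2[(j, i)] == mu2[i]
                       for (j, i) in edges if i == v)
                   for v in W if v not in inputs):
                return True
    return False


def minimal_inconsistent_cores(vertices, edges, sigma, mu, inputs):
    """Enumerate all MICs by brute force (exponential, toy instances only)."""
    args = (vertices, edges, sigma, mu, inputs)
    candidates = sorted(v for v in vertices if v not in inputs)
    mics = []
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            W = frozenset(combo)
            if has_witness(W, *args):
                continue  # W can be explained, so it is not a core
            # Witnesses for W \ {k} also serve all smaller subsets.
            if all(has_witness(W - {k}, *args) for k in W):
                mics.append(W)
    return mics


mics = minimal_inconsistent_cores(
    vertices={"a", "b", "c"},
    edges={("a", "b"), ("b", "c")},
    sigma={("a", "b"): 1, ("b", "c"): 1},
    mu={"a": 1, "c": -1},
    inputs={"a"},
)
```

Restricting candidates to non-input vertices follows the observation in the proof that a MIC cannot contain an input.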



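Proof of Theorem 5.2

Assume that $W = \{k_1, \ldots, k_n\}$ is a MIC. Then, the following conditions hold:

1. There are witnessing labelings $\sigma_1, \mu_1, \ldots, \sigma_n, \mu_n$ for $W \setminus \{k_1\}, \ldots, W \setminus \{k_n\}$.

2. There are no witnessing labelings for $W$.

We consider the following set $X$ of atoms:

\[
\begin{array}{rll}
X = & \{\mathit{vertex}(i) & \mid i \in V\} \\
\cup & \{\mathit{edge}(j,i) & \mid (j \to i) \in E\} \\
\cup & \{\mathit{observedE}(j,i,s) & \mid (j \to i) \in E,\ \sigma(j,i) = s\} \\
\cup & \{\mathit{observedV}(i,s) & \mid i \in V,\ \mu(i) = s\} \\
\cup & \{\mathit{input}(i) & \mid i \in V \text{ is an input}\} \\
\cup & \{\mathit{active}(i) & \mid i \in W\} \\
\cup & \{\mathit{inactive}(i) & \mid i \in V \setminus W \text{ is not an input}\} \\
\cup & \{\mathit{edgeMIC}(j,i) & \mid (j \to i) \in E,\ i \in W\} \\
\cup & \{\mathit{vertexMIC}(j) & \mid (j \to i) \in E,\ i \in W\} \\
\cup & \{\mathit{vertexMIC}(i) & \mid i \in W\} \\
\cup & \{\mathit{labelE}'(k_m,j,i,r) & \mid (j \to i) \in E,\ i \in W,\ \sigma_m(j,i) = r,\ 1 \le m \le n\} \\
\cup & \{\mathit{labelE}'(k_m,j,i,r) & \mid (j \to i) \in E,\ \sigma(j,i) = r,\ 1 \le m \le n\} \\
\cup & \{\mathit{labelV}'(k_m,j,s) & \mid (j \to i) \in E,\ i \in W,\ \mu_m(j) = s,\ 1 \le m \le n\} \\
\cup & \{\mathit{labelV}'(k_m,i,s) & \mid i \in W,\ \mu_m(i) = s,\ 1 \le m \le n\} \\
\cup & \{\mathit{labelV}'(k_m,i,s) & \mid i \in V,\ \mu(i) = s,\ 1 \le m \le n\} \\
\cup & \{\mathit{receive}'(k_m,i,sr) & \mid (j \to i) \in E,\ i \in W, \\
 & & \quad \sigma_m(j,i) = r,\ \mu_m(j) = s,\ i \neq k_m,\ 1 \le m \le n\} \\
\cup & \{\mathit{receive}'(k_m,i,sr) & \mid (j \to i) \in E,\ j \in W \text{ or } (j \to k) \in E \text{ for } k \in W, \\
 & & \quad \sigma(j,i) = r,\ \mu_m(j) = s,\ i \neq k_m,\ 1 \le m \le n\} \\
\cup & \{\mathit{receive}'(k_m,i,sr) & \mid (j \to i) \in E, \\
 & & \quad \sigma(j,i) = r,\ \mu(j) = s,\ i \neq k_m,\ 1 \le m \le n\} \\
\cup & \{\mathit{labelV}(i,+),\ \mathit{labelV}(i,-) & \mid i \in V\} \\
\cup & \{\mathit{labelE}(j,i,+),\ \mathit{labelE}(j,i,-) & \mid (j \to i) \in E\} \\
\cup & \{\mathit{opposite}(j,i) & \mid (j \to i) \in E\} \\
\cup & \{\mathit{bottom}\}.
\end{array}
\]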



For showing that $X$ is an answer set of $P_D \cup \tau((V,E,\sigma),\mu)$ (such that $\{i \mid \mathit{active}(i) \in X\} = W$), we need to verify that $X$ is a $\subseteq$-minimal model of

\[
P^X = \{(\mathit{head}(r) \leftarrow \mathit{body}(r)^+)\theta \mid r \in P_D \cup \tau((V,E,\sigma),\mu),\ (\mathit{body}(r)^-\theta) \cap X = \emptyset,\ \theta : \mathit{var}(r) \to \mathcal{U}\}
\]

where $\mathit{var}(r)$ is the set of all variables that occur in a rule $r$, $\mathcal{U}$ is the set of all constants appearing in $P_D \cup \tau((V,E,\sigma),\mu)$, and $\theta$ is a ground substitution for the variables in $r$.

To start with, we note that $X$ includes an atom $\mathit{vertex}(i)$, $\mathit{edge}(j,i)$, $\mathit{observedE}(j,i,s)$, $\mathit{observedV}(i,s)$, and $\mathit{input}(i)$, respectively, exactly if there is a fact with the atom in the head in $\tau((V,E,\sigma),\mu)$. Each of these facts belongs also to $P^X$, is satisfied by $X$, but not by any set $Y$ of atoms excluding at least one of the head atoms.

In view that $W$ cannot contain any input (otherwise, satisfaction of the second condition in Definition 5.1 would immediately imply violation of the first one), we have that either $\mathit{active}(i)$ or $\mathit{inactive}(i)$ belongs to $X$ for every non-input vertex $i \in V$. Hence, $X$ satisfies all ground instances of the rule

\[
\mathit{active}(V);\ \mathit{inactive}(V) \leftarrow \mathit{vertex}(V),\ \mathit{not}\ \mathit{input}(V).
\]

from (8) belonging to $P^X$, while no set $Y$ of atoms excluding both $\mathit{active}(i)$ and $\mathit{inactive}(i)$ for any non-input vertex $i \in V$ satisfies all of these ground instances. Considering ground instances of the rules

\[
\begin{array}{l}
\mathit{edgeMIC}(U,V) \leftarrow \mathit{edge}(U,V),\ \mathit{active}(V). \\
\mathit{vertexMIC}(U) \leftarrow \mathit{edgeMIC}(U,V). \\
\mathit{vertexMIC}(V) \leftarrow \mathit{active}(V).
\end{array}
\]

from (8), all of them belong to $P^X$, are satisfied by $X$, but not by any set $Y$ of atoms such that $\{\mathit{edgeMIC}(j,i) \mid \mathit{edgeMIC}(j,i) \in X\} \cup \{\mathit{vertexMIC}(i) \mid \mathit{vertexMIC}(i) \in X\} \not\subseteq Y$ and $\{\mathit{active}(i) \mid \mathit{active}(i) \in X\} \subseteq \{\mathit{active}(i) \mid \mathit{active}(i) \in Y\}$, while it has been shown above that $\{\mathit{active}(i) \mid \mathit{active}(i) \in X\} \not\subseteq \{\mathit{active}(i) \mid \mathit{active}(i) \in Y\}$ necessitates $\{\mathit{inactive}(i) \mid \mathit{inactive}(i) \in Y\} \not\subseteq \{\mathit{inactive}(i) \mid \mathit{inactive}(i) \in X\}$ for $Y$ being a model of $P^X$. Hence, there cannot be any model $Y \subset X$ of $P^X$ excluding some atom $\mathit{edgeMIC}(j,i)$ or $\mathit{vertexMIC}(i)$ that belongs to $X$.

Now turning our attention to atoms of form $\mathit{labelE}'(k_m,j,i,r)$ and $\mathit{labelV}'(k_m,j,s)$, we note that they are included in $X$ if $\mathit{edgeMIC}(j,i) \in X$ and $\mathit{vertexMIC}(j) \in X$, respectively, and $\sigma_m(j,i) = r$, $\mu_m(j) = s$ in witnessing labelings $\sigma_m$ and $\mu_m$ for $W \setminus \{k_m\}$, where $1 \le m \le n$, or if $\sigma(j,i) = r$, $\mu(j) = s$. Then, the fact that $\mathit{active}(k_m) \in X$ and labels assigned by $\sigma_m$ and $\mu_m$ are unique and respect those assigned by $\sigma$ and $\mu$ implies that none of the atoms can be removed from $X$ without violating some ground instance of the rules

\[
\begin{array}{l}
\mathit{labelV}'(W,V,+);\ \mathit{labelV}'(W,V,-) \leftarrow \mathit{active}(W),\ \mathit{vertexMIC}(V). \\
\mathit{labelE}'(W,U,V,+);\ \mathit{labelE}'(W,U,V,-) \leftarrow \mathit{active}(W),\ \mathit{edgeMIC}(U,V). \\
\mathit{labelV}'(W,V,S) \leftarrow \mathit{active}(W),\ \mathit{observedV}(V,S). \\
\mathit{labelE}'(W,U,V,S) \leftarrow \mathit{active}(W),\ \mathit{observedE}(U,V,S).
\end{array}
\]

from (10) that belongs to $P^X$. However, $X$ satisfies all of these ground instances by its construction. We further consider the following rules from (10):

\[
\begin{array}{l}
\mathit{receive}'(W,V,+) \leftarrow \mathit{labelE}'(W,U,V,S),\ \mathit{labelV}'(W,U,S),\ V \neq W. \\
\mathit{receive}'(W,V,-) \leftarrow \mathit{labelE}'(W,U,V,S),\ \mathit{labelV}'(W,U,T),\ V \neq W,\ S \neq T.
\end{array}
\]

As shown above, $\mathit{labelE}'(k_m,j,i,r)$ belongs to $X$ if $i \in W$ and $\sigma_m(j,i) = r$, or if $\sigma(j,i) = \sigma_m(j,i) = r$. Furthermore, $\mathit{labelV}'(k_m,j,s)$ is included in $X$ if $j \in W$ or $(j \to k) \in E$, $k \in W$ and $\mu_m(j) = s$, or if $\mu(j) = \mu_m(j) = s$. Comparing the cross product of these conditions to the definition of $X$ yields that an atom $\mathit{receive}'(k_m,i,sr)$ belongs to $X$ exactly if $\mathit{labelE}'(k_m,j,i,r)$ and $\mathit{labelV}'(k_m,j,s)$ are in $X$ and $i \neq k_m$. Hence, when excluding any of the atoms $\mathit{receive}'(k_m,i,sr)$ from $X$, some ground instance of the above two rules belonging to $P^X$ becomes unsatisfied, and so we have that such atoms cannot be removed from $X$ in order to construct a model $Y \subset X$ of $P^X$. Moreover, the fact that $\sigma_m$ and $\mu_m$ are witnessing labelings for $W' = W \setminus \{k_m\}$ implies that all ground instances of the integrity constraint

\[
\leftarrow \mathit{labelV}'(W,V,S),\ \mathit{active}(V),\ V \neq W,\ \mathit{not}\ \mathit{receive}'(W,V,S).
\]

from (10) that belong to $P^X$ are satisfied by $X$. In fact, for every $i \in W'$, there is some $(j \to i) \in E$ such that $\mu_m(i) = \mu_m(j)\sigma_m(j,i)$. Since $\mathit{labelE}'(k_m,j,i,\sigma_m(j,i))$ and $\mathit{labelV}'(k_m,j,\mu_m(j))$ belong to $X$, this implies that each atom $\mathit{labelV}'(k_m,i,\mu_m(i))$ for $i \in W'$ is accompanied by $\mathit{receive}'(k_m,i,\mu_m(i)) = \mathit{receive}'(k_m,i,\mu_m(j)\sigma_m(j,i))$ in $X$, so that the ground instance for $k_m$, $i$, and $\mu_m(i)$ of the integrity constraint is not in $P^X$.

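Finally, we consider atoms of the form $\mathit{labelV}(i,s)$, $\mathit{labelE}(j,i,s)$, and $\mathit{opposite}(j,i)$ that belong to $X$ for all $i \in V$ and $(j \to i) \in E$, respectively, and $s \in \{+,-\}$. Since $\mathit{bottom}$ is also in $X$, it is clear that the ground instances of the following rules from $P_D$, all of which belong to $P^X$, are satisfied by $X$:

\[
\begin{array}{l}
\mathit{labelV}(V,S) \leftarrow \mathit{observedV}(V,S). \\
\mathit{labelE}(U,V,S) \leftarrow \mathit{observedE}(U,V,S). \\
\mathit{labelV}(V,+);\ \mathit{labelV}(V,-) \leftarrow \mathit{vertexMIC}(V). \\
\mathit{labelE}(U,V,+);\ \mathit{labelE}(U,V,-) \leftarrow \mathit{edgeMIC}(U,V). \\
\mathit{opposite}(U,V) \leftarrow \mathit{labelE}(U,V,-),\ \mathit{labelV}(U,S),\ \mathit{labelV}(V,S). \\
\mathit{opposite}(U,V) \leftarrow \mathit{labelE}(U,V,+),\ \mathit{labelV}(U,S),\ \mathit{labelV}(V,T),\ S \neq T. \\
\mathit{bottom} \leftarrow \mathit{active}(V),\ \mathit{opposite}(U,V) : \mathit{edge}(U,V). \\
\mathit{labelV}(V,+) \leftarrow \mathit{bottom},\ \mathit{vertex}(V). \\
\mathit{labelV}(V,-) \leftarrow \mathit{bottom},\ \mathit{vertex}(V). \\
\mathit{labelE}(U,V,+) \leftarrow \mathit{bottom},\ \mathit{edge}(U,V). \\
\mathit{labelE}(U,V,-) \leftarrow \mathit{bottom},\ \mathit{edge}(U,V).
\end{array}
\]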

As shown above, any model $Y \subseteq X$ of $P^X$ must necessarily include $\mathit{observedV}(i,s)$ if $\mu(i) = s$, $\mathit{observedE}(j,i,s)$ if $\sigma(j,i) = s$, $\mathit{vertexMIC}(i)$ if $i \in W$ or $(i \to k) \in E$ for some $k \in W$, $\mathit{edgeMIC}(j,i)$ if $(j \to i) \in E$ for some $i \in W$, and $\mathit{active}(i)$ if $i \in W$. Proceeding by proof by contradiction, assume that there is a model $Y \subset X$ of $P^X$ such that $\mathit{labelV}(i,s)$, $\mathit{labelE}(j,i,s)$, or $\mathit{opposite}(j,i)$ is not in $Y$ for some $i \in V$ or $(j \to i) \in E$, respectively, and $s \in \{+,-\}$. From the previous considerations and the first two rules repeated above, we know that $\mathit{labelV}(i,s)$ and $\mathit{labelE}(j,i,s)$ must belong to $Y$ if $\mu(i) = s$ or $\sigma(j,i) = s$, respectively. Furthermore, the third rule necessitates $\{\mathit{labelV}(i,+), \mathit{labelV}(i,-)\} \cap Y \neq \emptyset$ for every $i \in W$ or $i \in V$ such that $(i \to k) \in E$ for some $k \in W$, and the fourth rule implies $\{\mathit{labelE}(j,i,+), \mathit{labelE}(j,i,-)\} \cap Y \neq \emptyset$ for every $(j \to i) \in E$ such that $i \in W$. In view of the last four rules, we immediately conclude that $\mathit{bottom} \notin Y$, which in turn implies that, for every $i \in W$, there is some $(j \to i) \in E$ such that $\mathit{opposite}(j,i)$ does not belong to $Y$. Comparing the rules defining $\mathit{opposite}$, the exclusion of $\mathit{opposite}(j,i)$ is possible only if $Y$ does not include $\mathit{labelE}(j,i,r)$, $\mathit{labelV}(j,s)$, and $\mathit{labelV}(i,t)$ such that $t \neq sr$. As we have shown above that some atoms $\mathit{labelE}(j,i,r)$, $\mathit{labelV}(j,s)$, and $\mathit{labelV}(i,t)$ for $r,s,t \in \{+,-\}$ must belong to $Y$, we can now conclude that $t = sr$ holds and that the atoms over predicates $\mathit{labelE}$ and $\mathit{labelV}$ in $Y$ define (partial) labelings $\sigma'$ and $\mu'$ by:

For every $i \in W$, pick some edge $(j \to i) \in E$ such that $\mathit{opposite}(j,i)$ does not belong to $Y$, and let $\sigma'(j,i) = r$ if $\mathit{labelE}(j,i,r) \in Y$, $\mu'(j) = s$ if $\mathit{labelV}(j,s) \in Y$, and $\mu'(i) = t$ if $\mathit{labelV}(i,t) \in Y$.

As we have seen above, such an edge $(j \to i) \in E$ exists for every $i \in W$, and the fact that $t \neq sr$ is not obtained for atoms $\mathit{labelE}(j,i,r)$, $\mathit{labelV}(j,s)$, and $\mathit{labelV}(i,t)$ in $Y$ implies that $\sigma'$ and $\mu'$ assign unique labels to $(j \to i)$, $j$, and $i$, respectively. When we totalize $\sigma'$ and $\mu'$ by setting $\sigma'(j,i) = \sigma(j,i)$ and $\mu'(i) = \mu(i)$ if $\sigma(j,i)$ or $\mu(i)$, respectively, is defined, and $\sigma'(j,i) = +$ as well as $\mu'(i) = +$ for all remaining edges in $E$ and vertices in $V$, we obtain witnessing labelings for $W$. But this is a contradiction to the fact that $W$ is a MIC, which allows us to conclude that there cannot be any model $Y \subset X$ of $P^X$ that omits $\mathit{labelV}(i,s)$, $\mathit{labelE}(j,i,s)$, or $\mathit{opposite}(j,i)$ for some $i \in V$ or $(j \to i) \in E$, respectively, and $s \in \{+,-\}$.

To conclude the proof that $X$ is a $\subseteq$-minimal model of $P^X$, note that the integrity constraint

\[
\leftarrow \mathit{not}\ \mathit{bottom}.
\]

from (9) does not contribute any rule to $P^X$ because $\mathit{bottom} \in X$. We have now investigated all rules in $P_D \cup \tau((V,E,\sigma),\mu)$ and shown that their ground instances in $P^X$ are satisfied by $X$. Furthermore, we have checked for all atoms in $X$ that they cannot be excluded in any model $Y \subset X$ of $P^X$. That is, $X$ is indeed a $\subseteq$-minimal model of $P^X$ and thus an answer set of $P_D \cup \tau((V,E,\sigma),\mu)$. $\Box$